Abstract

Active-set quadratic programming (QP) methods use a working set to define
the search direction and multiplier estimates. In the method proposed by
Fletcher in 1971, and in several subsequent mathematically equivalent methods,
the working set is chosen to control the inertia of the reduced Hessian, which
is never permitted to have more than one nonpositive eigenvalue. (We call
such methods inertia-controlling.) This paper presents an overview of a generic
inertia-controlling QP method, including the equations satisfied by the search
direction when the reduced Hessian is positive definite, singular and indefinite.
Recurrence relations are derived that define the search direction and Lagrange
multiplier vector through equations related to the Karush-Kuhn-Tucker system.
We also discuss connections with inertia-controlling methods that maintain an
explicit factorization of the reduced Hessian matrix.

1.  Introduction 

The quadratic programming (QP) problem is to minimize a quadratic objective function
subject to linear constraints on the variables. The linear constraints may include
an arbitrary mixture of equality and inequality constraints, where the latter may be
subject to lower and/or upper bounds. Many mathematically equivalent formulations
are possible, and the choice of form often depends on the context. For example,
in large-scale quadratic programs, it can be algorithmically advantageous to assume
that the constraints are posed in “standard form”, in which all general constraints
are equalities, and the only inequalities are simple upper and lower bounds on the
variables (see, for example, Gill et al. [GMSW87,GMSW88]).


*The material in this report is based upon research supported by the U.S. Department of Energy
grant DE-FG03-87ER25030, the National Science Foundation grant ECS-8715153, and the Office
of Naval Research contract N00014-87-K-0142.

To simplify the notation in this paper, we consider only general lower-bound
inequality constraints; however, the methods to be described can be generalized to
treat all forms of linear constraints. The quadratic program to be solved is thus

$$
\min_{x \in \mathbb{R}^n} \; \varphi(x) = c^T x + \tfrac{1}{2} x^T H x
\quad \text{subject to} \quad Ax \ge \beta, \tag{1.1}
$$

where the Hessian matrix $H$ is symmetric, and $A$ is a matrix with $n$ columns. Any point
$x$ satisfying $Ax \ge \beta$ is said to be feasible. The gradient of $\varphi$ is the linear function
$g(x) = c + Hx$. When $H$ is known to be positive definite, (1.1) is called a convex
QP; when $H$ may be any symmetric matrix, (1.1) is said to be a general QP.

This  paper  has  two  main  purposes:  first,  to  present  an  overview  of  the  theoretical 
properties  of  a  certain  class  of  active-set  methods  for  general  quadratic  programs; 
and  second,  to  specify  the  equations  and  recurrence  relations  satisfied  by  the  search 
direction and Lagrange multipliers. At each iteration of an active-set method, a certain
subset of the constraints (the working set) is of central importance. The defining
feature of the class of methods considered (which we call inertia-controlling) is
that  the  strategy  for  choosing  the  working  set  ensures  that  the  reduced  Hessian  with 
respect  to  the  working  set  (see  Section  2.3)  never  has  more  than  one  nonpositive 
eigenvalue.  In  contrast,  certain  methods  for  general  quadratic  programming  allow 
any  number  of  nonpositive  eigenvalues  in  the  reduced  Hessian — for  example,  the 
methods  of  Murray  [Mur71]  and  Bunch  and  Kaufman  [BK80]. 

To our knowledge, Fletcher’s method [Fle71] was the first inertia-controlling
quadratic programming method, and is derived using the partitioned inverse of the
Karush-Kuhn-Tucker matrix (see Sections 2.3 and 5.1). His original paper and subsequent
book [Fle81] discuss many of the properties to be considered here. The methods
of Gill and Murray [GM78] and of QPSOL [GMSW84c] are inertia-controlling
methods in which the search direction is obtained from the Cholesky factorization of
the reduced Hessian matrix. Gould [Gou86] proposes an inertia-controlling method
intended for sparse problems, based on updating certain LU factorizations. Finally,
the Schur-complement QP methods of Gill et al. [GMSW84b,GMSW87,GMSW88]
are designed mainly for sparse problems, particularly those that arise in applying
Newton-based sequential quadratic programming (SQP) methods to large nonlinearly
constrained problems.

Under  certain  conditions,  inertia-controlling  methods  and  the  methods  of  Murray 
[Mur71]  and  Bunch  and  Kaufman  [BK80]  generate  identical  iterates.  If  the  Hessian 
happens  to  be  positive  definite,  the  same  sequence  of  iterates  is  also  generated 
by a wide class of methods for convex QP (see, e.g., Cottle and Djang [CD79]).
Despite  these  theoretical  similarities,  inertia-controlling  methods  are  important  in 
their  own  right  because  of  the  useful  algorithmic  properties  that  follow  when  the 
reduced  Hessian  has  at  most  one  nonpositive  eigenvalue.  In  particular,  the  system 
of  equations  that  defines  the  search  direction  has  the  same  structure  regardless  of 
the eigenvalues of the reduced Hessian; this consistency allows certain factorizations
to be updated efficiently by recurrence (see Section 6).

We shall consider only primal-feasible QP methods, which require an initial feasible
point $x_0$, and thereafter generate a sequence $\{x_k\}$ of feasible approximations
to the solution of (1.1). If the feasible region of (1.1) is non-empty, a feasible point
to initiate the QP iterations can always be found by solving a linear programming
problem in which the (piecewise linear) sum of infeasibilities is minimized. (This
procedure constitutes the feasibility phase, and will not be discussed here; for details,
see, e.g., Gill et al. [GMSW85].) Despite our restriction, it should be noted
that an inertia-controlling strategy of imposing an explicit limit on the number of
nonpositive eigenvalues of the reduced Hessian can be applied in QP methods that
do not require feasibility at every iteration (e.g., in the method of Hoyle [Hoy86]).

Before proceeding, we emphasize that any discussion of QP methods should distinguish
between theoretical and computational properties. Even if methods are
based on mathematically identical definitions of the iterates, their performance in
practice depends on the efficiency, storage requirements and stability of the associated
numerical procedures. Various mathematical equivalences among QP methods
are discussed in Cottle and Djang [CD79] and Best [Bes84]. In the present paper,
Sections 2-4 are concerned primarily with theory, and Sections 5-6 treat computational
matters.

2.  Inertia-Controlling  Active-Set  Methods 
2.1.  Optimality  conditions 

The point $x$ is a local optimal solution of (1.1) if there exists a neighborhood of $x$
such that $\varphi(x) \le \varphi(\bar{x})$ for every feasible point $\bar{x}$ in the neighborhood. To ensure
that $x$ satisfies this definition, it is convenient to verify certain optimality conditions
that involve the relationship between $\varphi$ and the constraints.

The vector $p$ is called a direction of decrease for $\varphi$ at $x$ if there exists $\tau > 0$ such
that $\varphi(x + \alpha p) < \varphi(x)$ for all $0 < \alpha < \tau$. Every suitably small positive step along a
direction of decrease thus produces a strict reduction in $\varphi$. The nonzero vector $p$ is
said to be a feasible direction for the constraints of (1.1) at $x$ if there exists $\tau > 0$
such that $x + \alpha p$ is feasible for all $0 < \alpha < \tau$, i.e., if feasibility is retained for every
suitably small positive step along $p$. If a feasible direction of decrease exists at $x$,
every neighborhood of $x$ must contain feasible points with a strictly lower value of
$\varphi$, and consequently $x$ cannot be an optimal solution of (1.1).

The optimality conditions for (1.1) involve the subset of constraints active or
binding (satisfied exactly) at a possible solution $x$. (If a constraint is inactive at $x$,
it remains satisfied in every sufficiently small neighborhood of $x$.) Let $I_B$ (“B” for
“binding”) be the set of indices of the constraints active at the point $x$, and let $A_B$
denote the matrix whose rows are the normals of the active constraints. (Both $I_B$
and $A_B$ depend on $x$, but this dependence is usually omitted to simplify notation.)

The following conditions are necessary for the feasible point $x$ to be a solution
of (1.1):

$$ g(x) = A_B^T \mu_B \quad \text{for some } \mu_B; \tag{2.1a} $$

$$ \mu_B \ge 0; \tag{2.1b} $$

$$ v^T H v \ge 0 \quad \text{for all vectors } v \text{ such that } A_B v = 0. \tag{2.1c} $$

The necessity of these conditions is usually proved by contradiction; if any of the
three fails to hold at an alleged optimal point $x$, a feasible direction of decrease must
exist, and $x$ cannot be optimal.

The vector $\mu_B$ in (2.1a) is called the vector of Lagrange multipliers for the active
constraints, and is unique only if the active constraints are linearly independent. Let
$Z_B$ denote a basis for the null space of $A_B$, i.e., every vector $v$ satisfying $A_B v = 0$
can be written as a linear combination of the columns of $Z_B$. (Except in the trivial
case, $Z_B$ is not unique.) The vector $Z_B^T g(x)$ and the matrix $Z_B^T H Z_B$ are called the
reduced gradient and reduced Hessian of $\varphi$ (with respect to $A_B$). Condition (2.1a) is
equivalent to the requirement that $Z_B^T g(x) = 0$, and (2.1c) demands that $Z_B^T H Z_B$ be
positive semidefinite. Satisfaction of (2.1a) and (2.1c) is independent of the choice
of $Z_B$.

Various sufficient optimality conditions for (1.1) can be stated, but the following
are most useful for our purposes. The feasible point $x$ is a solution of (1.1) if there
exists a subset $I_P$ of $I_B$ (“P” for positive multipliers and positive definite), with
corresponding matrix $A_P$ of constraint normals, such that

$$ g(x) = A_P^T \mu_P; \tag{2.2a} $$

$$ \mu_P > 0; \tag{2.2b} $$

$$ v^T H v > 0 \quad \text{for all nonzero vectors } v \text{ such that } A_P v = 0. \tag{2.2c} $$

Condition (2.2b) states that all Lagrange multipliers associated with $A_P$ are positive,
and (2.2c) is equivalent to positive-definiteness of the reduced Hessian $Z_P^T H Z_P$,
where $Z_P$ denotes a basis for the null space of $A_P$. When the sufficient conditions
hold, $x$ is not only optimal, but is also locally unique, i.e., $\varphi(x) < \varphi(\bar{x})$ for all feasible
$\bar{x}$ in a neighborhood of $x$ ($\bar{x} \ne x$).

The gap between (2.1) and (2.2) arises from the possibility of one or more zero
Lagrange multipliers and/or a positive semidefinite and singular reduced Hessian.
When the necessary conditions are satisfied at some point $x$ but the sufficient conditions
are not, a feasible direction of decrease may or may not exist, so that $x$ is
not necessarily a local solution of (1.1). Verification of optimality in such instances
requires further information, and is in general an NP-hard problem (see Murty and
Kabadi [MK87], Pardalos and Schnitger [PS88]) that is equivalent to the copositivity
problem of quadratic programming (see, e.g., Contesse [Con80], Majthay [Maj71]).

2.2.  Definition  of  an  iteration 

Given an initial feasible point $x_0$, a generic inertia-controlling QP method (hereafter
called “the algorithm”) generates a sequence $\{x_k\}$ of approximations to the solution
of (1.1) that satisfy

$$ x_{k+1} = x_k + \alpha_k p_k, $$

where $p_k$ is a nonzero search direction and $\alpha_k$ is a nonnegative scalar steplength. In
the algorithms of interest, $p_k$ is always a direction of decrease, and $\alpha_k$ is chosen so
that $x_{k+1}$ remains feasible. We usually consider a single iteration (the $k$-th), and
use unsubscripted symbols to denote quantities associated with iteration $k$ when the
meaning is clear.

Let $g$ denote $g(x)$, the gradient of $\varphi$ at the current iterate. The following (standard)
terminology is useful in characterizing the relationship between $p$ and $\varphi$; the vector $p$ is a

    descent direction if $g^T p < 0$;

    direction of positive curvature if $p^T H p > 0$;

    direction of negative curvature if $p^T H p < 0$;

    direction of zero curvature if $p^T H p = 0$.

Because $\varphi$ is quadratic,

$$ \varphi(x + \alpha p) = \varphi(x) + \alpha g^T p + \tfrac{1}{2} \alpha^2 p^T H p. \tag{2.3} $$

This relation shows that every direction of decrease $p$ must be either a descent
direction, or a direction of negative curvature with $g^T p = 0$. If $g^T p < 0$ and $p^T H p > 0$,
we see from (2.3) that $\varphi(x + \alpha p) < \varphi(x)$ for all $0 < \alpha < \tau$, where $\tau = -2 g^T p / p^T H p$.
If $g^T p < 0$ and $p^T H p \le 0$, or if $g^T p = 0$ and $p^T H p < 0$, (2.3) shows that $\varphi$ is
monotonically decreasing along $p$, i.e., $\varphi(x + \alpha p) < \varphi(x)$ for all $\alpha > 0$.
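As a rough numerical illustration of the quadratic expansion (2.3), the following Python sketch checks the expansion along a descent direction of positive curvature and along a descent direction of negative curvature. NumPy is assumed; the function names `phi` and `slope_and_curvature` and all data are illustrative choices, not taken from the paper.

```python
import numpy as np

def phi(x, c, H):
    """Quadratic objective phi(x) = c^T x + 0.5 x^T H x."""
    return c @ x + 0.5 * x @ H @ x

def slope_and_curvature(p, g, H):
    """Return (g^T p, p^T H p) for a direction p."""
    return g @ p, p @ H @ p

# Illustrative data (an indefinite Hessian).
H = np.array([[2.0, 0.0], [0.0, -1.0]])
c = np.array([1.0, 1.0])
x = np.array([0.5, 0.5])
g = c + H @ x

# Descent direction of positive curvature: decrease only for 0 < alpha < tau.
p = np.array([-1.0, 0.0])
slope, curv = slope_and_curvature(p, g, H)
tau = -2.0 * slope / curv
alpha = 0.7
expansion = phi(x, c, H) + alpha * slope + 0.5 * alpha**2 * curv

# Descent direction of negative curvature: decrease for every alpha > 0.
p_neg = np.array([0.0, -1.0])
slope_neg, curv_neg = slope_and_curvature(p_neg, g, H)

print("tau =", tau)
print("expansion matches:", np.isclose(phi(x + alpha * p, c, H), expansion))
```

At `alpha = tau` the objective returns to its starting value, which is why the decrease along a direction of positive curvature holds only on the open interval.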

2.3. The role of the working set

At  each  iteration,  p  is  defined  in  terms  of  a  subset  of  the  constraints,  designated  as 
the  working  set.  The  “new”  working  set  is  always  obtained  by  modifying  the  “old” 
working  set,  and  the  prescription  for  altering  the  working  set  is  known  for  historical 
reasons  as  the  active-set  strategy. 

Although  it  is  sometimes  useful  to  think  of  the  working  set  as  a  prediction  of  the 
set  of  constraints  active  at  the  solution  of  (1.1),  we  stress  that  this  interpretation 
may  be  misleading.  The  working  set  is  defined  by  the  algorithm,  not  simply  by  the 
active  constraints.  In  particular,  the  working  set  may  not  contain  all  the  active 
constraints  at  any  iterate,  including  the  solution. 

The matrix of normals of constraints in the working set will be called $A$. Let $m$
denote the number of rows of $A$, $I$ the set of indices of constraints in the working
set, and $b$ the vector of corresponding components of $\beta$. We refer to both the index
set $I$ and the matrix $A$ as the working set. Let $Z$ denote a matrix whose columns
form a basis for the null space of $A$; the reduced gradient and reduced Hessian of $\varphi$
with respect to $A$ are then $Z^T g(x)$ and $Z^T H Z$. We sometimes denote the reduced
Hessian by $H_Z$. A nonzero vector $v$ such that $Av = 0$ is called a null-space direction,
and can be written as a linear combination of the columns of $Z$.
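The quantities just defined can be computed in several ways. The sketch below is one possibility, assuming NumPy, a working-set matrix with full row rank, and an orthonormal null-space basis taken from a complete QR factorization of the transposed working-set matrix. The helper names `null_space_basis` and `inertia`, the tolerance, and the data are hypothetical; the paper does not prescribe this construction.

```python
import numpy as np

def null_space_basis(A):
    """Orthonormal basis Z for the null space of A (full row rank assumed)."""
    m, n = A.shape
    # Complete QR of A^T: the last n - m columns of Q are orthogonal to the rows of A.
    Q, _ = np.linalg.qr(A.T, mode="complete")
    return Q[:, m:]

def inertia(T, tol=1e-10):
    """Counts of positive, negative and (numerically) zero eigenvalues of symmetric T."""
    eig = np.linalg.eigvalsh(T)
    return (int((eig > tol).sum()),
            int((eig < -tol).sum()),
            int((np.abs(eig) <= tol).sum()))

# Illustrative data: indefinite H, one constraint normal in the working set.
H = np.diag([1.0, -1.0, 2.0])
c = np.array([1.0, 0.0, -2.0])
x = np.array([0.0, 3.0, 1.0])
A = np.array([[0.0, 1.0, 0.0]])

Z = null_space_basis(A)
g = c + H @ x
reduced_gradient = Z.T @ g
reduced_hessian = Z.T @ H @ Z

print("inertia of H:        ", inertia(H))
print("inertia of Z^T H Z:  ", inertia(reduced_hessian))
```

In this example the Hessian is indefinite, but the reduced Hessian with respect to the working set is positive definite, which is the situation required of the initial working set.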

In inertia-controlling methods, the working set is constructed to have three important
characteristics:

WS1. Constraints in the working set are active at $x$;

WS2. The rows of $A$ are linearly independent;

WS3. The working set at $x_0$ is chosen so that the initial reduced Hessian is
positive definite.

Although each of these properties has an essential role in proving that an inertia-controlling
algorithm is well defined (see Sections 3 and 4), some of them also apply
to other active-set methods.

We emphasize that it may not be possible to enforce the crucial property WS3 at
an arbitrary starting point $x_0$ if the working set is selected only from the “original”
constraints; for example, suppose that $H$ is indefinite and no constraints are active
at $x_0$. Inertia-controlling methods must therefore include the ability to add certain
“temporary”  constraints  to  the  initial  working  set  in  order  to  ensure  that  property 
WS3  holds.  Such  constraints  are  an  algorithmic  device,  and  do  not  alter  the  solution 
(see  Section  4.4). 

This paper will consider only active-set primal-feasible methods that require
property WS1 to apply at the next iterate $x + \alpha p$ with the same working set used
to define $p$. This additional condition implies that the search direction must be a
null-space direction, so that

Ap  =  0. 

Accordingly,  we  sometimes  use  the  term  null-space  methods  to  describe  the  methods 
of  this  paper. 

A stationary point of the original QP (1.1) with respect to a particular working
set $A$ is any feasible point $x$ for which $Ax = b$ and the gradient of the objective
function is a linear combination of the columns of $A^T$, i.e.,

$$ g = c + Hx = A^T \mu, \tag{2.4} $$

where $g = g(x)$. Since $A$ has full row rank, $\mu$ is unique. For any stationary point, let
$\mu_s$ (“s” for “smallest”) denote the minimum component of $\mu$, i.e., $\mu_s = \min_i \mu_i$. An
equivalent statement of (2.4) is that the reduced gradient is zero at any stationary
point. The Karush-Kuhn-Tucker (KKT) matrix $K$ corresponding to $A$ is defined as

$$ K = \begin{pmatrix} H & A^T \\ A & 0 \end{pmatrix}. \tag{2.5} $$

When the reduced Hessian is nonsingular, $K$ is nonsingular (see Corollary 3.1).

A stationary point at which the reduced Hessian is positive definite is called a
minimizer, and is the unique solution of a QP in which constraints in the working
set appear as equalities:

$$
\min_{x \in \mathbb{R}^n} \; c^T x + \tfrac{1}{2} x^T H x
\quad \text{subject to} \quad Ax = b. \tag{2.6}
$$

The Lagrange multiplier vector for the equality constraints of (2.6) is the vector $\mu$
of (2.4). When the reduced Hessian is positive definite, the solution of (2.6) is $x - q$,
where $q$ solves the KKT system

$$ K \begin{pmatrix} q \\ \mu \end{pmatrix} = \begin{pmatrix} g \\ 0 \end{pmatrix}, \tag{2.7} $$

and $\mu$ is the associated Lagrange multiplier vector. If $x$ is a stationary point, $q = 0$.
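To make the KKT system concrete, here is a minimal dense sketch in Python (NumPy assumed). It takes the KKT matrix to have the usual block form, with the Hessian and the transposed working-set matrix in the first block row and the working-set matrix and a zero block in the second, and the right-hand side to be the gradient stacked over zeros; that form is an assumption consistent with the statement that the solution of (2.6) is x − q. The function name `kkt_step` and the data are hypothetical, and a practical method would update factorizations rather than form and solve the full system.

```python
import numpy as np

def kkt_step(H, A, g):
    """Solve [[H, A^T], [A, 0]] [q; mu] = [g; 0] and return (q, mu)."""
    n, m = H.shape[0], A.shape[0]
    K = np.block([[H, A.T], [A, np.zeros((m, m))]])
    rhs = np.concatenate([g, np.zeros(m)])
    sol = np.linalg.solve(K, rhs)
    return sol[:n], sol[n:]

# Illustrative equality-constrained QP with a positive definite reduced Hessian.
H = np.array([[2.0, 0.0], [0.0, 2.0]])
c = np.array([-2.0, -5.0])
A = np.array([[1.0, 1.0]])
b = np.array([1.0])
x = np.array([1.0, 0.0])            # satisfies A x = b

q, mu = kkt_step(H, A, c + H @ x)
x_new = x - q                        # minimizer on the working set

# At the minimizer the computed step vanishes and the multipliers are unchanged.
q_new, mu_new = kkt_step(H, A, c + H @ x_new)
print("x - q =", x_new, " mu =", mu)
```

The multiplier in this example is negative, so in the terminology of the paper the point reached is a minimizer with respect to the working set but not a solution of the inequality-constrained problem.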

Given  an  iterate  x  and  working  set  A,  an  inertia-controlling  method  must  be 
able  to 

•  determine whether $x$ is a stationary point with respect to $A$;

•  calculate the (unique) Lagrange multiplier vector $\mu$ at stationary points (see
(2.4));

•  determine whether the reduced Hessian is positive definite, positive semidefinite
and singular, or indefinite.

In  the  present  theoretical  context,  we  simply  assume  this  ability;  Sections  5-6  discuss 
techniques  for  computing  the  required  quantities. 

To motivate active-set QP methods, it is enlightening to think in terms of desirable
properties of the search direction. For example, since $p$ is always a null-space
direction (i.e., $Ap = 0$), any step along $p$ stays “on” constraints in the working set.
Furthermore, it seems “natural” to choose $p$ as a direction of decrease for $\varphi$ because
problem (1.1) involves minimizing $\varphi$. We therefore seek to obtain a null-space
direction of decrease at every iteration. Such a direction can be computed using the
current working set in the following two situations:

(i)  when  x  is  not  a  stationary  point; 

(ii)  when  x  is  a  stationary  point  and  the  reduced  Hessian  is  indefinite. 

If  neither  (i)  nor  (ii)  applies,  the  algorithm  terminates  or  changes  the  working  set 
(see  Section  2.4). 

When  (i)  holds,  the  nature  of  p  depends  on  the  reduced  Hessian.  (The  specific 
equations  satisfied  by  p  are  given  in  Section  4.1;  only  its  general  properties  are 
summarized here.) If the reduced Hessian is positive definite, $p$ is taken as $-q$, the
step to the solution of the associated equality-constrained subproblem (see (2.6) and
(2.7)). This vector is a descent direction of positive curvature, and has the property
that $\alpha = 1$ is the step to the smallest value of $\varphi$ along $p$. When the reduced
Hessian  is  positive  semidefinite  and  singular,  p  is  chosen  as  a  descent  direction  of 
zero  curvature.  When  the  reduced  Hessian  is  indefinite,  p  is  taken  as  a  descent 
direction  of  negative  curvature. 

When  (ii)  holds,  i.e.,  when  x  is  a  stationary  point  with  an  indefinite  reduced 
Hessian,  p  is  taken  as  a  direction  of  negative  curvature. 

2.4.  Deleting  constraints  from  the  working  set 

When $x$ is a stationary point at which the reduced Hessian is positive semidefinite,
it is impossible to reduce $\varphi$ by moving along a null-space direction. Depending on
the sign of the smallest Lagrange multiplier and the nature of the reduced Hessian,
the algorithm must either terminate or change the working set by deleting one or
more constraints.

Let $x$ be any stationary point (so that $g = A^T \mu$), and suppose that $\mu_s < 0$ for
constraint $s$ in the working set. Let $e_s$ be the $s$-th coordinate vector. Given a vector
$p$ satisfying

$$ Ap = \gamma e_s \quad (\gamma > 0), $$

a positive step along $p$ moves “off” (strictly feasible to) constraint $s$, but remains
“on” the other constraints in $A$. (The full rank of the working set guarantees that
the equations $Ap = v$ are compatible for any vector $v$.) It follows that

$$ g^T p = \mu^T A p = \gamma \mu^T e_s = \gamma \mu_s < 0, $$

so that $p$ is a descent direction. A negative multiplier for constraint $s$ thus suggests
that a null-space descent direction can be found by deleting constraint $s$ from the
working set. However, our freedom to delete constraints is limited by the inertia-controlling
strategy. To ensure that the reduced Hessian has no more than one
nonpositive eigenvalue, a constraint can be deleted only at a minimizer. (Section 3.3
provides theoretical validation of this policy.)
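The argument that a negative multiplier yields a descent direction can be checked numerically. The Python sketch below (NumPy assumed) builds one particular direction that moves off constraint s while staying on the others, namely the minimum-norm solution returned by `np.linalg.lstsq`, and confirms that the slope of the objective along it equals the step scaling times the multiplier. The helper name `move_off_constraint` and the data are hypothetical, and the minimum-norm choice is only for illustration; it is not the search direction used by the algorithm.

```python
import numpy as np

def move_off_constraint(A, s, gamma=1.0):
    """Minimum-norm p with A p = gamma * e_s (A assumed to have full row rank)."""
    e_s = np.zeros(A.shape[0])
    e_s[s] = 1.0
    # For a consistent underdetermined system, lstsq returns the minimum-norm solution.
    p, *_ = np.linalg.lstsq(A, gamma * e_s, rcond=None)
    return p

# Illustrative stationary point: g = A^T mu with one negative multiplier.
A = np.array([[1.0, 0.0, 0.0],
              [0.0, 1.0, 1.0]])
mu = np.array([2.0, -3.0])
g = A.T @ mu

s = int(np.argmin(mu))               # index of the smallest multiplier
p = move_off_constraint(A, s, gamma=1.0)

print("A p   =", A @ p)              # gamma * e_s
print("g^T p =", g @ p)              # gamma * mu_s, negative
```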

When $x$ is a minimizer, the action of the algorithm depends on the sign of
$\mu_s$. If $\mu_s > 0$, the sufficient conditions (2.2) for optimality apply with $I_P = I$,
and the algorithm terminates. If $\mu_s \le 0$, constraint $s$ is deleted from the working
set, thereby creating at most one nonpositive eigenvalue in the reduced Hessian.
There are two cases to consider. If $\mu_s < 0$ and constraint $s$ is removed from the
working set, $x$ cannot be a stationary point with respect to the “new” working
set. On the other hand, if $\mu_s = 0$, the uniqueness of $\mu$ implies not only that $x$
stays a stationary point after removal of constraint $s$, but also that the multipliers
corresponding to the remaining constraints are unaltered. The algorithm therefore
continues to delete constraints with zero multipliers until either a working set is
found for which $\mu_s > 0$ or the reduced Hessian ceases to be positive definite. If the
reduced Hessian is positive definite after all constraints with zero multipliers have
been deleted, $x$ satisfies the sufficient optimality conditions (2.2) and the algorithm
terminates. Once the reduced Hessian has ceased to be positive definite, no further
constraints are deleted.

An  inertia-controlling  algorithm  cannot  reach  a  stationary  point  with  a  positive 
semidefinite  and  singular  reduced  Hessian  by  adding  a  constraint  (see  Lemma  4.5). 
Such  a  point  can  be  reached  only  by  deleting  a  constraint  with  a  zero  multiplier; 
the smallest multiplier associated with the working set after deletion must be nonnegative,
and the algorithm terminates. In this case, the necessary conditions (2.1)
are  satisfied,  but  x  may  not  be  optimal  for  the  original  problem  (1.1),  as  discussed 
at  the  end  of  Section  2.1. 

The pseudo-code in Figure 1 summarizes the constraint deletion procedure performed
at the beginning of each iteration. The logical variables positive_definite,
positive_semidefinite, singular and indefinite are assumed to be recomputed after
each constraint is deleted; the logical variable complete is used to terminate the
overall algorithm (see Figure 3). The details of the boxed computation (deleting a
constraint from the working set) depend on the particular inertia-controlling algorithm
(see Section 5.1). It is important to notice that more than one working set
can be associated with a given iterate $x$.

    working_set_chosen ← false;
    repeat until working_set_chosen
        if not stationary_point or indefinite then working_set_chosen ← true
        else
            μ_s ← min_j μ_j;
            if (positive_semidefinite and singular) or μ_s > 0 then
                complete ← true;
                working_set_chosen ← true;
            else
                if μ_s < 0 then stationary_point ← false end if
                delete constraint s from the working set;
            end if
        end if
    end repeat


Figure 1: Pseudo-code for constraint deletion.


2.5.  Adding  constraints  to  the  working  set 


Conceptually, constraints are deleted from the working set before computing $p$, and
are added to the working set after computing the steplength $\alpha$. Since $p$ is always
a direction of decrease, the goal of minimizing $\varphi$ suggests that $\alpha$ should be taken
as the step along $p$ that produces the largest decrease in $\varphi$. Furthermore, $x + \alpha p$ is
automatically feasible with respect to constraints in the working set because $p$ is a
null-space direction. However, $\alpha$ may need to be restricted so that the new iterate
remains feasible with respect to constraints not in the working set. A constraint that
is active at $x$ but is not in the working set is called idle; for example, a constraint
that has just been deleted from the working set is idle.

Let $i$ be the index of a constraint not in the working set. The constraint will
not be violated at $x + \alpha p$ for any positive step $\alpha$ if $a_i^T p \ge 0$. If $a_i^T p < 0$, however,
the constraint will become active at a certain nonnegative step. For every $i \notin I$, $\alpha_i$
is defined as

$$
\alpha_i =
\begin{cases}
(\beta_i - a_i^T x) / a_i^T p & \text{if } a_i^T p < 0; \\
+\infty & \text{otherwise.}
\end{cases}
\tag{2.8}
$$


The maximum feasible step $\alpha_F$ (often called the step to the nearest constraint) is
defined as $\alpha_F = \min_i \alpha_i$. The value of $\alpha_F$ is zero if and only if $a_i^T p < 0$ for at least
one idle constraint $i$. If $\alpha_F$ is infinite, the constraints do not restrict positive steps
along $p$.
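The step to the nearest constraint is a standard ratio test. A minimal Python sketch follows, assuming NumPy, constraints written as rows of a matrix with lower bounds, and the working set passed as a set of row indices. The function name `max_feasible_step`, its arguments, and the tie rule (first index wins) are illustrative assumptions; the paper leaves the tie-breaking rule open.

```python
import numpy as np

def max_feasible_step(A_full, beta, x, p, working):
    """Return (alpha_F, r): the largest feasible step along p and the blocking index.

    Constraints are a_i^T x >= beta_i; indices in `working` are skipped.
    r is None when no constraint restricts the step (alpha_F is infinite).
    """
    alpha_F, r = np.inf, None
    for i in range(A_full.shape[0]):
        if i in working:
            continue
        slope = A_full[i] @ p
        if slope < 0:                                  # constraint i is approached
            alpha_i = (beta[i] - A_full[i] @ x) / slope
            if alpha_i < alpha_F:                      # ties keep the first index
                alpha_F, r = alpha_i, i
    return alpha_F, r

# Illustrative constraints: x1 >= 0, x2 >= 0, -x1 - x2 >= -4.
A_full = np.array([[1.0, 0.0], [0.0, 1.0], [-1.0, -1.0]])
beta = np.array([0.0, 0.0, -4.0])
x = np.array([1.0, 1.0])

step_down = max_feasible_step(A_full, beta, x, np.array([1.0, -2.0]), set())
step_diag = max_feasible_step(A_full, beta, x, np.array([1.0, 1.0]), set())
step_free = max_feasible_step(A_full[:2], beta[:2], x, np.array([1.0, 0.0]), set())
print(step_down, step_diag, step_free)
```

The third call drops the upper-bounding constraint, so no constraint is approached along the direction and the step is unrestricted.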

In order to retain feasibility, $\alpha$ must satisfy $\alpha \le \alpha_F$. If the reduced Hessian is
positive definite, the step of unity along $p$ has special significance, since $p$ in this
case is taken as $-q$ of (2.7), and $\varphi$ achieves its minimum value along $p$ at $\alpha = 1$
(see (2.6)). When the reduced Hessian is either indefinite or positive semidefinite
and singular, $\varphi$ is monotonically decreasing along $p$ (see Section 2.2). Hence, the
nonnegative step $\alpha$ along $p$ that produces the maximum reduction in $\varphi$ and retains
feasibility is

$$
\alpha =
\begin{cases}
\min(1, \alpha_F) & \text{if } Z^T H Z \text{ is positive definite;} \\
\alpha_F & \text{otherwise.}
\end{cases}
$$

In order for the algorithm to proceed, $\alpha$ must be finite. If $\alpha = \infty$, $\varphi$ is unbounded
below in the feasible region, (1.1) has an infinite solution, and the algorithm terminates.

Let $r$ denote the index of a constraint for which $\alpha_F = \alpha_r$. The algorithm requires
a single value of $r$, so that some rule is necessary in case of ties; for example, $r$ may
be chosen to improve the estimated condition of the working set. (Several topics
related to this choice are discussed in Gill et al. [GMSW].) When $\alpha = \alpha_r$, the
constraint with index $r$ becomes active at the new iterate. In the inertia-controlling
methods to be considered, $a_r$ is added to the working set at this stage of the iteration,
with one exception: a constraint is not added when the reduced Hessian is positive
definite and $\alpha_F = 1$. In this case, $x + p$ is automatically a minimizer with respect
to the current working set, which means that at least one constraint will be deleted
at the beginning of the next iteration (see Section 2.4).

Assuming  the  availability  of  a  suitable  direction  of  decrease  p,  the  pseudo-code 
in  Figure  2  summarizes  the  constraint  addition  procedure.  As  in  Figure  1,  details 
of  the  boxed  computation  (adding  a  constraint  to  the  working  set)  depend  on  the 
particular  inertia-controlling  algorithm  (see,  e.g.,  Sections  6.1  and  6.2). 

    α_F ← maximum feasible step along p (to constraint r);
    hit_constraint ← not positive_definite or α_F < 1;
    if hit_constraint then α ← α_F else α ← 1 end if;
    if α = ∞ then stop
    else
        x ← x + αp;
        if hit_constraint then
            add constraint r to the working set
        end if
    end if

Figure 2: Pseudo-code for constraint addition.

The  following  lemma  shows  that  all  working  sets  have  full  rank  in  a  null-space 
inertia-controlling  method. 

Lemma 2.1. Assume that the initial working set has full rank. For the active-set
QP algorithm just described, any constraint added to the working set must be linearly
independent of the constraints in the working set.

Proof. A constraint $a_r$ can be added to the working set $A$ only if $a_r^T p < 0$ (see
(2.8)). If $a_r$ were linearly dependent on the working set, we could express $a_r$ as
$a_r = A^T r$ for some nonzero vector $r$. However, $p$ is a null-space direction, and the
relation $Ap = 0$ would then imply that $a_r^T p = r^T A p = 0$, a contradiction. ∎

Putting  together  the  deletion  and  addition  strategies,  Figure  3  summarizes  the 
general  structure  of  the  inner  loop  of  an  inertia-controlling  QP  method.  The  logical 
variable  complete  indicates  whether  the  method  has  terminated. 


    complete ← false;
    repeat until complete
        execute constraint deletion procedure (Figure 1);
        if not complete then
            compute p;
            execute constraint addition procedure (Figure 2);
        end if
    end repeat


Figure  3:  Structure  of  iteration  loop. 


3.  Theoretical  Background 

This  section  summarizes  theoretical  results  used  in  proving  that  inertia-controlling 
methods  are  well  defined. 

3.1.  The  Schur  complement 
Given the partitioned symmetric matrix

$$T = \begin{pmatrix} M & W^T \\ W & G \end{pmatrix}, \tag{3.1}$$

where $M$ is nonsingular, the Schur complement of $M$ in $T$ is denoted by $T/M$, and is defined as

$$T/M = G - W M^{-1} W^T. \tag{3.2}$$

We  sometimes  refer  simply  to  “the”  Schur  complement  when  the  relevant  matrices 
are  evident. 

An important application of the Schur complement is in solving $Ty = d$ when $T$ has the form (3.1) and is nonsingular. Let the right-hand side $d$ and the unknown $y$ be partitioned to conform with (3.1):

$$d = \begin{pmatrix} d_1 \\ d_2 \end{pmatrix}, \qquad y = \begin{pmatrix} y_1 \\ y_2 \end{pmatrix}.$$

Then $y$ may be obtained by solving (in order)

$$Mw = d_1, \tag{3.3a}$$

$$(T/M)\,y_2 = d_2 - Ww, \tag{3.3b}$$

$$My_1 = d_1 - W^T y_2. \tag{3.3c}$$
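As a small worked illustration of the three solves (the numbers are chosen here for illustration and do not come from the report), take scalar blocks $M = 2$, $W = 1$, $G = 3$ and right-hand side $d = (4, 7)^T$, with the Schur complement taken as $T/M = G - W M^{-1} W^T$:

```latex
\[
\begin{gathered}
T = \begin{pmatrix} 2 & 1 \\ 1 & 3 \end{pmatrix}, \qquad
T/M = G - W M^{-1} W^T = 3 - \tfrac{1}{2} = \tfrac{5}{2}, \\[4pt]
Mw = d_1 \;\Rightarrow\; w = 2, \qquad
(T/M)\,y_2 = d_2 - Ww = 5 \;\Rightarrow\; y_2 = 2, \qquad
My_1 = d_1 - W^T y_2 = 2 \;\Rightarrow\; y_1 = 1.
\end{gathered}
\]
```

Substituting back gives $Ty = (2 \cdot 1 + 1 \cdot 2,\; 1 \cdot 1 + 3 \cdot 2)^T = (4, 7)^T$, as required.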

Let $T$ be any symmetric matrix. We denote by $i_p(T)$, $i_n(T)$ and $i_z(T)$ respectively the (nonnegative) numbers of positive, negative and zero eigenvalues of $T$. The inertia of $T$, denoted by $\mathrm{In}(T)$, is the associated integer triple $(i_p, i_n, i_z)$. For any suitably dimensioned nonsingular matrix $S$, Sylvester's law of inertia states that

$$\mathrm{In}(T) = \mathrm{In}(S^T T S). \tag{3.4}$$

An important relationship holds among the inertias of $T$, $M$ and the Schur complement (3.2):

$$\mathrm{In}(T) = \mathrm{In}(M) + \mathrm{In}(T/M) \tag{3.5}$$

(see Haynsworth [Hay68]).

An analogous Schur complement can be defined for a nonsymmetric matrix $T$. When $M$ is singular, the generalized Schur complement is obtained by substituting the generalized inverse of $M$ for $M^{-1}$ in (3.2), and by appropriate adjustment of (3.3). The "classical" Schur complement (3.2) and its properties are discussed in detail by Cottle [Cot74]. For further details on the generalized Schur complement, see Carlson, Haynsworth and Markham [CHM74] and Ando [And74]. Carlson [Car86] gives an interesting survey of results on both classical and generalized Schur complements, along with an extensive bibliography.

3.2.  The  KKT  matrix  and  the  reduced  Hessian 

The eigenvalue structure of the reduced Hessian determines the logic of an inertia-controlling method, and the KKT matrix of (2.5) plays a central role in defining the search direction. The following theorem gives an important relationship between the KKT matrix and the reduced Hessian $Z^T H Z$.

Theorem 3.1. Let $H$ be an $n \times n$ symmetric matrix, $A$ an $m \times n$ matrix of full row rank, $K$ the KKT matrix of (2.5), and $Z$ a null-space basis for $A$. Then

$$\mathrm{In}(K) = \mathrm{In}\begin{pmatrix} H & A^T \\ A & 0 \end{pmatrix} = \mathrm{In}(Z^T H Z) + (m, m, 0).$$

Proof. See Gould [Gou85]. Since every basis for the null space may be written as $ZS$ for some nonsingular matrix $S$, Sylvester's law of inertia (3.4) implies that the inertia of the reduced Hessian is independent of the particular choice of $Z$. We emphasize that the full rank of $A$ is essential in this result. ∎

Corollary 3.1. The KKT matrix $K$ is nonsingular if and only if the reduced Hessian $Z^T H Z$ is nonsingular. ∎
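Theorem 3.1 can be checked numerically on small examples. The sketch below is not part of the report: it computes inertias exactly for matrices with rational entries by forming the characteristic polynomial with the Faddeev-LeVerrier recurrence and counting sign changes (Descartes' rule, which is exact when all roots are real, as they are for symmetric matrices). The helper names `inertia` and `kkt` and the test data are hypothetical; a production code would instead use a symmetric indefinite factorization.

```python
from fractions import Fraction

def matmul(X, Y):
    return [[sum(X[i][k] * Y[k][j] for k in range(len(Y)))
             for j in range(len(Y[0]))] for i in range(len(X))]

def transpose(X):
    return [list(col) for col in zip(*X)]

def inertia(T):
    """Exact inertia (i_p, i_n, i_z) of a symmetric matrix with rational entries."""
    n = len(T)
    A = [[Fraction(v) for v in row] for row in T]
    # Faddeev-LeVerrier: det(lambda*I - A) = sum_k c[k] * lambda**k, c[n] = 1.
    c = [Fraction(0)] * (n + 1)
    c[n] = Fraction(1)
    M = [[Fraction(0)] * n for _ in range(n)]
    for k in range(1, n + 1):
        AM = matmul(A, M)
        M = [[AM[i][j] + (c[n - k + 1] if i == j else 0) for j in range(n)]
             for i in range(n)]
        AMk = matmul(A, M)
        c[n - k] = -sum(AMk[i][i] for i in range(n)) / k
    # Zero eigenvalues: multiplicity of the root lambda = 0.
    i_z = next(k for k in range(n + 1) if c[k] != 0)
    # Positive eigenvalues: sign changes among the nonzero coefficients.
    signs = [v > 0 for v in reversed(c) if v != 0]
    i_p = sum(1 for s, t in zip(signs, signs[1:]) if s != t)
    return (i_p, n - i_p - i_z, i_z)

def kkt(H, A):
    """Assemble K = [[H, A^T], [A, 0]] from lists of rows."""
    n, m = len(H), len(A)
    At = transpose(A)
    return [H[i] + At[i] for i in range(n)] + [A[i] + [0] * m for i in range(m)]

# Illustrative data: n = 3, m = 1, and Z chosen by hand so that A Z = 0.
H = [[2, 0, 0], [0, -1, 0], [0, 0, 3]]
A = [[1, 1, 0]]
Z = [[1, 0], [-1, 0], [0, 1]]
reduced = matmul(transpose(Z), matmul(H, Z))   # Z^T H Z
m = len(A)
lhs = inertia(kkt(H, A))
rhs = tuple(a + b for a, b in zip(inertia(reduced), (m, m, 0)))
```

For this data the reduced Hessian is diag(1, 3), so both `lhs` and `rhs` equal (3, 1, 0), in agreement with the theorem.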

3.3. Changes in the working set

The nature of the KKT matrix leads to several results concerning the eigenvalue structure of the reduced Hessian following a change in the working set.

Lemma 3.1. Let $M$ and $M_+$ denote symmetric matrices of the following form:

$$M = \begin{pmatrix} H & B^T \\ B & 0 \end{pmatrix} \quad\text{and}\quad M_+ = \begin{pmatrix} H & B_+^T \\ B_+ & 0 \end{pmatrix},$$

where $B_+$ is $B$ with one additional row. (The subscript "+" is intended to emphasize which matrix has the extra row.) Then exactly one of the following cases holds:

(a) $i_p(M_+) = i_p(M) + 1$, $i_n(M_+) = i_n(M)$ and $i_z(M_+) = i_z(M)$;

(b) $i_p(M_+) = i_p(M) + 1$, $i_n(M_+) = i_n(M) + 1$ and $i_z(M_+) = i_z(M) - 1$;

(c) $i_p(M_+) = i_p(M)$, $i_n(M_+) = i_n(M) + 1$ and $i_z(M_+) = i_z(M)$;

(d) $i_p(M_+) = i_p(M)$, $i_n(M_+) = i_n(M)$ and $i_z(M_+) = i_z(M) + 1$.

Proof. It is sufficient to prove the result for the case when

$$B_+ = \begin{pmatrix} B \\ b^T \end{pmatrix}, \tag{3.6}$$

where $b^T$ is a suitably dimensioned row vector. If the additional row of $B_+$ occurs in any position other than the last, there exists a permutation $\Pi$ (representing a row interchange) such that $\Pi B_+$ has the form (3.6). Let

$$P = \begin{pmatrix} I & \\ & \Pi \end{pmatrix}, \quad\text{which gives}\quad P M_+ P^T = \begin{pmatrix} H & B^T & b \\ B & 0 & 0 \\ b^T & 0 & 0 \end{pmatrix}. \tag{3.7}$$

Because $P$ is orthogonal, $P M_+ P^T$ is a similarity transform of $M_+$, and has the same eigenvalues (see Wilkinson [Wil65], page 7). Thus the lemma applies equally to $M_+$ and $P M_+ P^T$.

When $B_+$ has the form (3.6), standard theory on the interlacing properties of the eigenvalues of bordered symmetric matrices states that

$$\lambda_1^+ \ge \lambda_1 \ge \lambda_2^+ \ge \cdots \ge \lambda_\ell \ge \lambda_{\ell+1}^+,$$

where $\ell$ is the dimension of $M$, and $\{\lambda_i\}$ and $\{\lambda_i^+\}$ are the eigenvalues of $M$ and $M_+$ respectively, in decreasing order (see, e.g., Wilkinson [Wil65], pages 96-97). The desired results follow by analyzing the consequences of these inequalities. ∎

By combining the general interlacing result of Lemma 3.1 with the specific properties of the KKT matrix from Theorem 3.1, we derive the following lemma, which applies to either adding or deleting a single constraint from the working set.

Lemma 3.2. Let $A$ be an $m \times n$ matrix of full row rank, and let $A_+$ denote $A$ with one additional linearly independent row (so that $A_+$ also has full row rank). The matrices $Z$ and $Z_+$ denote null-space bases for $A$ and $A_+$, and $H_Z$ and $H_{Z_+}$ denote the associated reduced Hessian matrices $Z^T H Z$ and $Z_+^T H Z_+$. (Note that the dimension of $H_{Z_+}$ is one less than the dimension of $H_Z$.) Define $K$ and $K_+$ as

$$K = \begin{pmatrix} H & A^T \\ A & 0 \end{pmatrix} \quad\text{and}\quad K_+ = \begin{pmatrix} H & A_+^T \\ A_+ & 0 \end{pmatrix},$$

where $H$ is an $n \times n$ symmetric matrix. Then exactly one of the following cases holds:

(a) $i_p(H_{Z_+}) = i_p(H_Z) - 1$, $i_n(H_{Z_+}) = i_n(H_Z) - 1$ and $i_z(H_{Z_+}) = i_z(H_Z) + 1$;

(b) $i_p(H_{Z_+}) = i_p(H_Z) - 1$, $i_n(H_{Z_+}) = i_n(H_Z)$ and $i_z(H_{Z_+}) = i_z(H_Z)$;

(c) $i_p(H_{Z_+}) = i_p(H_Z)$, $i_n(H_{Z_+}) = i_n(H_Z) - 1$ and $i_z(H_{Z_+}) = i_z(H_Z)$;

(d) $i_p(H_{Z_+}) = i_p(H_Z)$, $i_n(H_{Z_+}) = i_n(H_Z)$ and $i_z(H_{Z_+}) = i_z(H_Z) - 1$.

Proof. Since $A$ and $A_+$ have full row rank, Theorem 3.1 applies to both $K$ and $K_+$, and gives $i_p(K) = i_p(H_Z) + m$, $i_p(K_+) = i_p(H_{Z_+}) + m + 1$, $i_n(K) \ge m$ and $i_n(K_+) \ge m + 1$. Substituting from these relations into the four cases of Lemma 3.1, we obtain the desired results. ∎

When a constraint is added to the working set, $A$ and $A_+$ correspond to the "old" and "new" working sets. Lemma 3.2 shows that adding a constraint to the working set either leaves unchanged the number of nonpositive eigenvalues of the reduced Hessian, or decreases the number of nonpositive eigenvalues by one. The following corollary lists the possible outcomes of adding a constraint to the working set.

Corollary 3.2. Under the same assumptions as in Lemma 3.2:

(a) if $Z^T H Z$ is positive definite and a constraint is added to the working set, $Z_+^T H Z_+$ must be positive definite;

(b) if $Z^T H Z$ is positive semidefinite and singular and a constraint is added to the working set, $Z_+^T H Z_+$ may be positive definite or positive semidefinite and singular;

(c) if $Z^T H Z$ is indefinite and a constraint is added to the working set, $Z_+^T H Z_+$ may be positive definite, positive semidefinite and singular, or indefinite. ∎
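The three outcomes in part (c) can each be realized on a small example. The data below are chosen here purely for illustration and are not from the report: $n = 3$, $m = 1$, with a reduced Hessian that has exactly one negative eigenvalue, and three alternative rows that could be added to $A$.

```latex
\[
\begin{gathered}
H = \begin{pmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & -1 \end{pmatrix}, \quad
A = \begin{pmatrix} 1 & 0 & 0 \end{pmatrix}, \quad
Z = \begin{pmatrix} 0 & 0 \\ 1 & 0 \\ 0 & 1 \end{pmatrix}, \quad
Z^T H Z = \begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix}
\ (\text{indefinite}); \\[4pt]
\text{new row } (0\ \ 0\ \ 1): \quad Z_+ = (0, 1, 0)^T, \quad
Z_+^T H Z_+ = 1 \ (\text{positive definite}); \\
\text{new row } (0\ \ 1\ \ 1): \quad Z_+ = (0, 1, -1)^T, \quad
Z_+^T H Z_+ = 1 - 1 = 0 \ (\text{singular}); \\
\text{new row } (0\ \ 1\ \ 0): \quad Z_+ = (0, 0, 1)^T, \quad
Z_+^T H Z_+ = -1 \ (\text{one negative eigenvalue}).
\end{gathered}
\]
```

In the terminology of Lemma 3.2 these are cases (c), (a) and (b) respectively; in every case the number of nonpositive eigenvalues either stays at one or drops to zero.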


For a constraint deletion, on the other hand, the roles of $A$ and $A_+$ are reversed ($K_+$ is the "old" KKT matrix and $K$ is the "new"). In this case, Lemma 3.2 shows that deleting a constraint from the working set can either leave unchanged the number of nonpositive eigenvalues of $Z^T H Z$, or increase the number of nonpositive eigenvalues by one.


If constraints are deleted only when the reduced Hessian is positive definite,
Lemma  3.2  validates  the  inertia-controlling  strategy  by  ensuring  that  the  reduced 
Hessian  will  never  have  more  than  one  nonpositive  eigenvalue  following  a  deletion 
and  any  number  of  additions.  Accordingly,  when  the  reduced  Hessian  matrix  is 
hereafter  described  as  “indefinite”,  it  has  a  single  negative  eigenvalue,  with  all  other 
eigenvalues  positive;  and  when  the  reduced  Hessian  matrix  is  described  as  “singular”, 
it  has  one  zero  eigenvalue,  with  all  other  eigenvalues  positive. 

3.4.  Relations  involving  the  KKT  matrix 

We  now  prove  several  results  that  will  be  used  in  Section  4.  It  should  be  emphasized 
that  the  following  lemma  makes  no  assumption  about  the  nonsingularity  of  K. 

Lemma 3.3. Let $A$ and $A_+$ be matrices with linearly independent rows, where $A_+$ is $A$ with a row added in position $s$. Let $K$, $Z$, $K_+$ and $Z_+$ be defined as in Lemma 3.2. If $K_+$ is nonsingular, then

$$\mathrm{In}(K) + (1, 1, 0) = \mathrm{In}(K_+) + \mathrm{In}(-\sigma),$$

where $\sigma$ is the $(n+s)$-th diagonal element of $K_+^{-1}$, i.e., $\sigma = e_{n+s}^T K_+^{-1} e_{n+s}$.

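Proof. Consider the matrix

$$K_{\rm aug} = \begin{pmatrix} K_+ & e_{n+s} \\ e_{n+s}^T & 0 \end{pmatrix},$$

where $e_{n+s}$ is the $(n+s)$-th coordinate vector. Using definition (3.2) of the Schur complement,

$$K_{\rm aug}/K_+ = -e_{n+s}^T K_+^{-1} e_{n+s} = -\sigma.$$

Since $K_+$ is nonsingular, relation (3.5) applies to $K_{\rm aug}$, and we have

$$\mathrm{In}(K_{\rm aug}) = \mathrm{In}(K_+) + \mathrm{In}(-\sigma). \tag{3.8}$$

Because of the special forms of $K$ and $K_+$, it is possible to obtain an expression that relates the inertias of $K$ and $K_{\rm aug}$. Let the new row of $A_+$ be row $s$, and denote the corresponding $n$-vector by $a_*$. As in (3.7), a permutation matrix can be symmetrically applied to $K_{\rm aug}$ so that row $s$ becomes the last row in the upper left square block of size $n + m + 1$. Further permutations lead to the following symmetrically reordered version of $K_{\rm aug}$:

$$\tilde K_{\rm aug} = P K_{\rm aug} P^T = \begin{pmatrix} 0 & 1 & a_*^T & 0 \\ 1 & 0 & 0 & 0 \\ a_* & 0 & H & A^T \\ 0 & 0 & A & 0 \end{pmatrix},$$

where $P$ is a permutation matrix. Since $\tilde K_{\rm aug}$ is a symmetric permutation of $K_{\rm aug}$, the two matrices have the same eigenvalues, and hence

$$\mathrm{In}(\tilde K_{\rm aug}) = \mathrm{In}(K_{\rm aug}). \tag{3.9}$$

The $2 \times 2$ matrix in the upper left-hand corner of $\tilde K_{\rm aug}$ (denoted by $E$) is nonsingular, with eigenvalues $\pm 1$, and satisfies

$$\mathrm{In}(E) = (1, 1, 0) \quad\text{with}\quad E^{-1} = E = \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix}.$$

Using (3.2), it can be verified algebraically that the Schur complement of $E$ in $\tilde K_{\rm aug}$ is simply $K$:

$$\tilde K_{\rm aug}/E = K - \begin{pmatrix} a_* & 0 \\ 0 & 0 \end{pmatrix} E \begin{pmatrix} a_*^T & 0 \\ 0 & 0 \end{pmatrix} = K.$$

Applying (3.5) to $\tilde K_{\rm aug}$ and using (3.9), we obtain

$$\mathrm{In}(K_{\rm aug}) = \mathrm{In}(E) + \mathrm{In}(\tilde K_{\rm aug}/E) = (1, 1, 0) + \mathrm{In}(K). \tag{3.10}$$

Combining (3.8) and (3.10) gives the desired result. ∎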


Corollary 3.3. Let $K$ and $K_+$ be defined as in Lemma 3.3. Consider the nonsingular linear system

$$K_+ \begin{pmatrix} y \\ w \end{pmatrix} = e_{n+s}, \tag{3.11}$$

where $y$ has $n$ components. Let $w_*$ denote the $s$-th component of $w$. (Since the solution of (3.11) is column $n + s$ of $K_+^{-1}$, $w_* = \sigma$ of Lemma 3.3.) Then:

(a) if $Z^T H Z$ is positive definite and $Z_+^T H Z_+$ is positive definite, $w_*$ must be negative;

(b) if $Z^T H Z$ is singular and $Z_+^T H Z_+$ is positive definite, $w_*$ must be zero;

(c) if $Z^T H Z$ is indefinite and $Z_+^T H Z_+$ is positive definite, $w_*$ must be positive. ∎

Lemma 3.4. Let $K$ and $K_+$ be defined as in Lemma 3.3, with the further assumptions that $Z_+^T H Z_+$ is positive definite and $Z^T H Z$ is indefinite. Let $z$ denote the first $n$ components of the solution of

$$\begin{pmatrix} H & A^T \\ A & 0 \end{pmatrix} \begin{pmatrix} z \\ t \end{pmatrix} = \begin{pmatrix} a_* \\ 0 \end{pmatrix}, \tag{3.12}$$

where $a_*^T$ is the additional row of $A_+$. Then $a_*^T z < 0$.

Proof. Because $Z^T H Z$ is indefinite, $K$ is nonsingular (see Theorem 3.1). The vectors $z$ and $t$ of (3.12) are therefore unique, and satisfy

$$Hz + A^T t - a_* = 0, \qquad Az = 0. \tag{3.13}$$

We now relate the solutions of (3.12) and (3.11). Because of the special structure of $K_+$, the unique solution of (3.11) satisfies

$$Hy + A^T w_A + a_* w_* = 0, \qquad Ay = 0, \qquad a_*^T y = 1, \tag{3.14}$$

where $w_A$ denotes the subvector of $w$ corresponding to $A$, and $w_*$ is the component of $w$ corresponding to $a_*$. Corollary 3.3 implies that $w_* > 0$. Comparing (3.14) and (3.13), we conclude that $y = -w_* z$. Since $a_*^T y = 1$, this relation implies that $a_*^T z = -1/w_* < 0$, which is the desired result. ∎
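The relations used in this proof can be verified on a small instance. The following sketch is illustrative only: the data ($n = 3$, one working-set row, and the added row $a_*$) are chosen here, and `solve` is a hypothetical helper that performs exact Gauss-Jordan elimination with rational arithmetic. The systems solved are $K(z; t) = (a_*; 0)$ and $K_+(y; w_A; w_*) = e_{n+s}$, written out row by row.

```python
from fractions import Fraction

def solve(K, rhs):
    """Solve K u = rhs exactly by Gauss-Jordan elimination with row swaps.
    Raises StopIteration if K is singular."""
    n = len(K)
    aug = [[Fraction(v) for v in row] + [Fraction(b)] for row, b in zip(K, rhs)]
    for col in range(n):
        piv = next(r for r in range(col, n) if aug[r][col] != 0)
        aug[col], aug[piv] = aug[piv], aug[col]
        aug[col] = [v / aug[col][col] for v in aug[col]]
        for r in range(n):
            if r != col and aug[r][col] != 0:
                f = aug[r][col]
                aug[r] = [vr - f * vc for vr, vc in zip(aug[r], aug[col])]
    return [row[n] for row in aug]

# Illustrative data: H = diag(1, 1, -1), A = (1 0 0), a_* = (0, 0, 1).
# Z^T H Z = diag(1, -1) is indefinite; Z_+^T H Z_+ = 1 is positive definite.
a_star = [0, 0, 1]
K = [[1, 0, 0, 1],
     [0, 1, 0, 0],
     [0, 0, -1, 0],
     [1, 0, 0, 0]]
K_plus = [[1, 0, 0, 1, 0],
          [0, 1, 0, 0, 0],
          [0, 0, -1, 0, 1],
          [1, 0, 0, 0, 0],
          [0, 0, 1, 0, 0]]

z = solve(K, a_star + [0])[:3]             # first n components of (z; t)
u = solve(K_plus, [0, 0, 0, 0, 1])         # (y; w_A; w_*) for rhs e_{n+s}
y, w_star = u[:3], u[4]
a_star_z = sum(ai * zi for ai, zi in zip(a_star, z))
```

Here `w_star` is positive, as Corollary 3.3(c) predicts, and `a_star_z` equals `-1 / w_star`, which is negative.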

4.  Theoretical  Properties  of  Inertia-Controlling  Methods 

In this section we give the equations used to define the search direction after the working set has been chosen (see Section 2.4), and then prove various properties of inertia-controlling methods. When the reduced Hessian is positive definite, choosing $p$ as $-q$ from the KKT system (2.7) means that $\alpha = 1$ (the step to the minimizer of $\varphi$ along $p$) can be viewed as the "natural" step. In contrast, if the reduced Hessian is singular or indefinite, the search direction needs to be specified only to within a positive multiple. Since $\varphi$ is monotonically decreasing along $p$ when the reduced Hessian is not positive definite, the steplength $\alpha$ is determined not by $\varphi$, but by the nearest constraint (see Section 2.5). Hence, multiplying $p$ by any positive number $\gamma$ simply divides the steplength by $\gamma$, and produces the identical next iterate.
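The scaling invariance noted above is easy to confirm numerically. The sketch below is illustrative and makes assumptions that are not in the report: constraints are taken in the form a^T x >= b, the step to the nearest constraint is found by a ratio test, and the function name `step_to_nearest_constraint` and the data are hypothetical. Exact rational arithmetic is used so that the two next iterates can be compared for equality.

```python
from fractions import Fraction

def dot(u, v):
    return sum(ui * vi for ui, vi in zip(u, v))

def step_to_nearest_constraint(x, p, constraints):
    """Smallest step at which some constraint a^T x >= b becomes active
    along p; returns None if no constraint blocks the direction."""
    steps = [(dot(a, x) - b) / (-dot(a, p))
             for a, b in constraints if dot(a, p) < 0]
    return min(steps) if steps else None

constraints = [([1, 0], 0), ([0, 1], 0)]    # x1 >= 0 and x2 >= 0
x = [Fraction(3), Fraction(2)]
p = [Fraction(-1), Fraction(-2)]
gamma = 7                                   # any positive multiple

alpha = step_to_nearest_constraint(x, p, constraints)
p_scaled = [gamma * pi for pi in p]
alpha_scaled = step_to_nearest_constraint(x, p_scaled, constraints)

x_next = [xi + alpha * pi for xi, pi in zip(x, p)]
x_next_scaled = [xi + alpha_scaled * pi for xi, pi in zip(x, p_scaled)]
```

Both directions reach the same point on the bound x2 >= 0, with the step along the scaled direction equal to the original step divided by `gamma`.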

4.1.  Definition  of  the  search  direction 

The  mathematical  specification  of  the  search  direction  depends  on  the  eigenvalue 
structure  of  the  reduced  Hessian,  and,  in  the  indefinite  case,  on  the  nature  of  the 
current  iteration. 

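Positive definite. If the reduced Hessian is positive definite, the search direction $p$ is taken as $p = -q$, where $q$ is part of the solution of the KKT system (2.7):

$$\begin{pmatrix} H & A^T \\ A & 0 \end{pmatrix} \begin{pmatrix} -p \\ \mu \end{pmatrix} = \begin{pmatrix} g \\ 0 \end{pmatrix}. \tag{4.1}$$

An equivalent definition of $p$, which will be relevant in Sections 6.1 and 6.2, involves the null-space equations:

$$p = Z p_Z, \quad\text{where}\quad Z^T H Z p_Z = -Z^T g.$$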

Singular. If the reduced Hessian is singular and the algorithm does not terminate, we shall show later that $x$ cannot be a stationary point (see Lemma 4.5). The search direction $p$ is defined as $\beta \bar p$, where $\bar p$ is the unique nonzero direction satisfying

$$\begin{pmatrix} H & A^T \\ A & 0 \end{pmatrix} \begin{pmatrix} \bar p \\ \nu \end{pmatrix} = \begin{pmatrix} 0 \\ 0 \end{pmatrix}, \tag{4.2}$$

and $\beta$ is chosen to make $p$ a descent direction. Equivalently, $\bar p$ is defined by

$$\bar p = Z p_Z, \quad\text{where}\quad Z^T H Z p_Z = 0, \quad \|p_Z\| \ne 0.$$

This vector $p_Z$ is a multiple of the single eigenvector corresponding to the zero eigenvalue of $Z^T H Z$.

Indefinite. If the reduced Hessian is indefinite, it must be nonsingular, with exactly one negative eigenvalue. In this case, $p$ is defined in two different ways.

First, if the current working set was obtained either by deleting a constraint with a negative multiplier from the immediately preceding working set, or by adding a constraint, then $p$ is taken as $q$ from the KKT system (2.7), i.e., $p$ satisfies

$$\begin{pmatrix} H & A^T \\ A & 0 \end{pmatrix} \begin{pmatrix} p \\ \mu \end{pmatrix} = \begin{pmatrix} g \\ 0 \end{pmatrix}. \tag{4.3}$$


Second, if the current working set is the result of deleting a constraint with a zero multiplier from the immediately preceding working set, let $a_*$ denote the normal of the deleted constraint. The current point is still a stationary point with respect to $A$ (see Section 2.4), and hence $g = A^T \mu$ for some vector $\mu$. The search direction $p$ is defined by

$$\begin{pmatrix} H & A^T & a_* \\ A & 0 & 0 \\ a_*^T & 0 & 0 \end{pmatrix} \begin{pmatrix} p \\ \nu \\ w_* \end{pmatrix} = \begin{pmatrix} g \\ 0 \\ 1 \end{pmatrix}, \tag{4.4}$$

which can also be written as

$$\begin{pmatrix} H & A^T & a_* \\ A & 0 & 0 \\ a_*^T & 0 & 0 \end{pmatrix} \begin{pmatrix} p \\ w \\ w_* \end{pmatrix} = \begin{pmatrix} 0 \\ 0 \\ 1 \end{pmatrix}, \tag{4.5}$$

where $w = \nu - \mu$. The KKT matrix including $a_*$ must have been nonsingular to allow a constraint deletion, so that the solution of either (4.4) or (4.5) is unique, and Corollary 3.3 implies that $w_* > 0$.
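A small worked example may help fix the first two definitions. The data are chosen here for illustration and are not from the report: $n = 2$, $A = (1 \ \ 1)$, $Z = (1, -1)^T$ and $g = (1, 3)^T$. In the positive-definite case the direction satisfies $g + Hp = A^T\mu$, $Ap = 0$ (the form used in the proofs of Lemma 4.1 and Theorem 4.1); in the singular case it satisfies $H\bar p + A^T\nu = 0$, $A\bar p = 0$.

```latex
\[
\begin{gathered}
\textbf{Positive definite: } H = I: \qquad
g + Hp = A^T\mu,\ \ Ap = 0
\;\Rightarrow\; p = \begin{pmatrix} 1 \\ -1 \end{pmatrix},\ \ \mu = 2; \\
Z^T H Z = 2,\quad Z^T g = -2
\;\Rightarrow\; p_Z = 1,\quad p = Z p_Z = \begin{pmatrix} 1 \\ -1 \end{pmatrix},
\qquad g^T p = -2 = -p^T H p < 0. \\[6pt]
\textbf{Singular: } H = \begin{pmatrix} 1 & 1 \\ 1 & 1 \end{pmatrix}: \qquad
Z^T H Z = 0,\quad \bar p = Z = \begin{pmatrix} 1 \\ -1 \end{pmatrix},\quad
H \bar p = 0 \ (\nu = 0); \\
g^T \bar p = -2 < 0 \;\Rightarrow\; \beta = 1,\quad p = \bar p,\quad p^T H p = 0.
\end{gathered}
\]
```

In the first case the KKT and null-space definitions give the same direction, and the unit step is the natural one; in the second, $p$ is a descent direction of zero curvature, so the step is limited only by the constraints.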


4.2.  Intermediate  iterations 

Various properties of inertia-controlling methods have been proved by Fletcher and others (see, e.g., [Fle71,Fle81,GM78,Gou86]). In this section, we use the Schur-complement results of Section 3 to analyze certain sequences of iterates in an inertia-controlling method. The initial point $x_0$ is assumed to be feasible; the initial working set has full row rank and is chosen so that the reduced Hessian is positive definite (see Section 4.4).

A  recurring  difficulty  in  describing  inertia-controlling  methods  is  that  one  cannot 
always  refer  without  ambiguity  to  “the”  working  set  associated  with  an  iterate.  The 
following  terminology  is  intended  to  characterize  the  relationship  between  an  iterate 
and  a  working  set.  Let  x  be  an  iterate  of  an  inertia-controlling  method  and  A  a 
valid  working  set  for  x,  so  that  the  rows  of  A  are  linearly  independent  normals  of 
constraints active at $x$. As usual, $Z$ denotes a null-space basis for $A$. We say that $x$ is

standard if $Z^T H Z$ is positive definite;

nonstandard if $Z^T H Z$ is not positive definite;

a minimizer if $Z^T g = 0$ and $Z^T H Z$ is positive definite;

intermediate if $x$ is not a minimizer.

In each case, the term requires a specification of $A$, which is omitted only when its meaning is obvious. We stress that the same point can be, for example, a minimizer with respect to one working set $A$, but intermediate with respect to another (usually, $A$ with one or more constraints deleted).

We now examine the properties of intermediate iterates that occur after a constraint is deleted at one minimizer, but before the next minimizer is reached. Each such iterate is associated with a unique most recently deleted constraint. Consider a sequence of consecutive intermediate iterates $\{x_k\}$, $k = 0, \ldots, N$, with the following three features:

I1. $x_k$ is intermediate with respect to the working set $A_k$;

I2. $A_0$ is obtained by deleting the constraint with normal $a_*$ from the working set $A_*$, so that $x_0$ is a minimizer with respect to $A_*$;

I3. $x_k$, $1 \le k \le N$, is not a minimizer with respect to any valid working set.

At $x_k$, $p_k$ is defined using $A_k$ as $A$ (and, if necessary, $a_*$ as $a_*$) in (4.1), (4.2), (4.3) or (4.4). (Note that (4.4) may be used only at $x_0$.)

Let $Z_*$ denote a basis for the null space of $A_*$. For purposes of this discussion, the position of $a_*^T$ in $A_*$ is irrelevant, and hence we assume that $A_*$ has the form

$$A_* = \begin{pmatrix} A_0 \\ a_*^T \end{pmatrix}. \tag{4.6}$$

Because of the inertia-controlling strategy, the reduced Hessian $Z_*^T H Z_*$ must be positive definite. Relation (4.6) implies that

$$p^T H p > 0 \ \text{ for any nonzero } p \text{ such that } A_0 p = 0 \text{ and } a_*^T p = 0. \tag{4.7}$$

If the iterate following $x_k$ is intermediate and the algorithm continues, $\alpha_k$ is the step to the nearest constraint, and a constraint is added to the working set at each $x_k$, $k \ge 1$. If a constraint is added and $x_k$ is standard, it must hold that $\alpha_k < 1$. (Otherwise, if $\alpha_k = 1$, $x_k + p_k$ is a minimizer with respect to $A_k$, and the sequence of intermediate iterates ends.) Let $a_k$ denote the normal of the constraint added to $A_k$ at $x_{k+1}$ to produce $A_{k+1}$, so that the form of $A_{k+1}$ is

$$A_{k+1} = \begin{pmatrix} A_k \\ a_k^T \end{pmatrix}. \tag{4.8}$$

We now prove several lemmas leading to the result that the gradient at each intermediate iterate $x_k$ may be expressed as a linear combination of $A_k^T$ and $a_*$. For simplicity, whenever possible we adopt the convention that unbarred and barred quantities are associated with intermediate iterates $k$ and $k + 1$ respectively.

Lemma 4.1. Let $g$ and $A$ denote the gradient and working set at an intermediate iterate $x$ where $p$ is defined by (4.1)-(4.3), and $a_*$ is the most recently deleted constraint. Let $\bar x = x + \alpha p$, and assume that constraint $a$ is added to $A$ at $\bar x$, giving the working set $\bar A$. If there exist a vector $v$ and a scalar $v_*$ such that

$$g = A^T v - v_* a_*, \quad\text{with } v_* > 0, \tag{4.9}$$

then

(a) $\bar g$, the gradient at $\bar x$, is also a linear combination of $A^T$ and $a_*$;

(b) there exist a vector $\bar v$ and scalar $\bar v_*$ such that

$$\bar g = \bar A^T \bar v - \bar v_* a_*, \quad\text{with } \bar v_* > 0. \tag{4.10}$$

Proof. Because $\varphi$ is quadratic,

$$g(x + \alpha p) = g + \alpha H p. \tag{4.11}$$

We now consider the form of $\bar g$ for the three possible definitions of $p$.

When the reduced Hessian is positive definite, $p$ satisfies $g + Hp = A^T \mu$, so that $Hp = -g + A^T \mu$. Substituting from this expression and (4.9) in (4.11), we obtain (a) from

$$\bar g = g + \alpha H p = (1 - \alpha) g + \alpha A^T \mu = A^T \lambda - \bar v_* a_*,$$

where $\lambda = (1 - \alpha) v + \alpha \mu$ and $\bar v_* = (1 - \alpha) v_*$. Since $\alpha < 1$, (b) is obtained by forming $\bar v$ from $\lambda$ and a zero component corresponding to row $a^T$ in $\bar A$.

When the reduced Hessian is singular, $p$ is defined as $\beta \bar p$, where $\beta \ne 0$ and $\bar p$ satisfies (4.2), so that $Hp = -\beta A^T \nu$. Substituting from this relation and (4.9) in (4.11) gives

$$\bar g = g + \alpha H p = g - \alpha \beta A^T \nu = A^T (v - \alpha \beta \nu) - v_* a_*,$$

and (4.10) holds with $\bar v_* = v_*$ and $\bar v$ formed by augmenting $\lambda = v - \alpha \beta \nu$ with a zero component as above.

Finally, when the reduced Hessian is indefinite and the search direction is defined by (4.3), $Hp = g - A^T \mu$. Substituting from this relation and (4.9) in (4.11), we obtain

$$\begin{aligned}
\bar g = g + \alpha H p &= g + \alpha (g - A^T \mu) \\
&= (1 + \alpha) g - \alpha A^T \mu \\
&= (1 + \alpha) A^T v - \alpha A^T \mu - (1 + \alpha) v_* a_* \\
&= A^T \lambda - \bar v_* a_*,
\end{aligned}$$

where $\lambda = (1 + \alpha) v - \alpha \mu$ and $\bar v_* = (1 + \alpha) v_*$. Since $v_* > 0$, $\bar v_*$ must be positive, and $\bar g$ has the desired form. ∎

To begin the induction, note that if the multiplier associated with $a_*$ at $x_0$ is negative, then, from (4.6),

$$g_0 = A_*^T \mu = A_0^T \mu_0 - v_*^0 a_*, \tag{4.12}$$

where $v_*^0 = -\mu_* > 0$. The next lemma treats the other possibility, that a zero multiplier was associated with $a_*$, i.e., that $x_0$ is a stationary point with respect to $A_0$. This situation is possible only if the reduced Hessian associated with $A_0$ is indefinite. (If it were positive definite, the algorithm would delete further constraints; if it were singular, the algorithm would terminate at $x_0$.)

Lemma 4.2. Assume that the reduced Hessian is indefinite at the first intermediate iterate $x_0$, and that a zero multiplier is associated with $a_*$. Then

$$g_0^T p_0 = 0, \quad p_0^T H p_0 < 0 \quad\text{and}\quad a_*^T p_0 > 0. \tag{4.13}$$

If $\alpha_0 > 0$, then $g_1 = g(x_0 + \alpha_0 p_0)$ may be written as a linear combination of $a_*$ and the rows of $A_0$. Moreover, there exist a vector $v^1$ and scalar $v_*^1$ such that

$$g_1 = A_1^T v^1 - v_*^1 a_*, \tag{4.14}$$

with $v_*^1 > 0$.

Proof. Since a zero multiplier is associated with $a_*$, $x_0$ is a stationary point with respect to $A_0$, i.e., $g_0 = A_0^T \mu_0$. Multiplying by $p_0^T$ shows that $g_0^T p_0 = 0$. Using (4.5), $p_0$ satisfies

$$H p_0 = -A_0^T w_0 - w_* a_*, \tag{4.15}$$

where $w_* > 0$, so that

$$p_0^T H p_0 = -w_* a_*^T p_0. \tag{4.16}$$

Rewriting the definition (4.5) of $p$ as

$$\begin{pmatrix} H & A_0^T \\ A_0 & 0 \end{pmatrix} \begin{pmatrix} p_0 \\ w_0 \end{pmatrix} = -w_* \begin{pmatrix} a_* \\ 0 \end{pmatrix}, \tag{4.17}$$

Lemma 3.4 implies that $a_*^T p_0 > 0$. It then follows from (4.16) that $p_0^T H p_0 < 0$, which completes verification of (4.13).

Now we assume that $\alpha_0 > 0$. Since $g_1 = g_0 + \alpha_0 H p_0$, (4.15) and the relation $g_0 = A_0^T \mu_0$ give

$$g_1 = g_0 + \alpha_0 H p_0 = A_0^T \mu_0 - \alpha_0 A_0^T w_0 - \alpha_0 w_* a_* = A_0^T \lambda - v_*^1 a_*,$$

where $v_*^1 = \alpha_0 w_*$ and $\lambda = \mu_0 - \alpha_0 w_0$. Since $\alpha_0 > 0$ and $w_* > 0$, $v_*^1$ is strictly positive, and $g_1$ has the desired form. If constraint $a_0$ is added to the working set at the new iterate, $g_1$ can equivalently be written as in (4.14) by forming $v^1$ from an augmented version of $\lambda$ as in Lemma 4.1. ∎

We are now able to derive some useful results concerning the sequence of intermediate iterates.

Lemma 4.3. Given a sequence of consecutive intermediate iterates $\{x_k\}$ satisfying properties I1-I3, the gradient $g_k$ satisfies (4.9) for $k \ge 0$ if a constraint with a negative multiplier is deleted at $x_0$, and for $k \ge 1$ if a constraint with a zero multiplier is deleted at $x_0$ and $\alpha_0 > 0$.

Proof. If a constraint with a negative multiplier is deleted at $x_0$, (4.9) holds at $x_0$ by definition (see (4.12)). If a constraint with a zero multiplier is deleted at $x_0$ and $\alpha_0 > 0$, Lemma 4.2 shows that (4.9) holds at $x_1$. Lemma 4.1 therefore applies at all subsequent intermediate iterates, where we adopt the convention that $v$ increases in dimension by one at each step to reflect the fact that $A_k$ has one more row than $A_{k-1}$. ∎

Lemma 4.4. Let $\{x_k\}$ be a sequence of consecutive intermediate iterates satisfying properties I1-I3. Given any vector $p$ such that $A_k p = 0$, the following two properties hold for $k \ge 0$ if a constraint with a negative multiplier is deleted at $x_0$, and for $k \ge 1$ if a constraint with a zero multiplier is deleted at $x_0$ and $\alpha_0 > 0$:

(a) if $g_k^T p < 0$, then $a_*^T p > 0$;

(b) if $a_*^T p > 0$, then $g_k^T p < 0$.

Proof. We know from part (b) of Lemma 4.3 that, for the stated values of $k$, there exist a vector $v^k$ and positive scalar $v_*^k$ such that

$$g_k = A_k^T v^k - v_*^k a_*.$$

Therefore, $g_k^T p = -v_*^k a_*^T p$ and the desired results are immediate. ∎

Lemma 4.5. Assume that $\{x_k\}$, $k = 0, \ldots, N$, is a sequence of consecutive intermediate iterates satisfying I1-I3, where each $x_k$, $1 \le k \le N$, is not a stationary point with respect to $A_k$. Assume further that $\alpha_0 > 0$ if a zero multiplier is deleted at $x_0$, and that $\alpha_N$ is the step to the constraint with normal $a_N$, which is added to $A_N$ to form the working set $A_{N+1}$. Let $x_{N+1} = x_N + \alpha_N p_N$.

(a) If $x_{N+1}$ is a stationary point with respect to $A_{N+1}$, then $a_N$ is linearly dependent on $A_N^T$ and $a_*$, and $Z_{N+1}^T H Z_{N+1}$ is positive definite;

(b) If $a_N$ is linearly dependent on $A_N^T$ and $a_*$, then $x_{N+1}$ is a minimizer with respect to $A_{N+1}$.

Proof. By construction, the working set $A_*$ has full row rank, so that $a_*$ is linearly independent of the rows of $A_0$. We know from part (b) of Lemma 4.3 that

$$g_k = A_k^T v^k - v_*^k a_*, \quad k = 1, \ldots, N, \tag{4.18}$$

where $v_*^k > 0$. Since we have assumed that $x_k$ is not a stationary point with respect to $A_k$ for any $1 \le k \le N$, (4.18) shows that $a_*$ is linearly independent of $A_k^T$. Furthermore, part (a) of Lemma 4.1 implies that there exists a vector $\lambda$ such that

$$g_{N+1} = A_N^T \lambda - v_*^{N+1} a_*, \tag{4.19}$$

where $v_*^{N+1} > 0$. It follows from the linear independence of $a_*$ and $A_N^T$ that $x_{N+1}$ cannot be a stationary point with respect to the "old" working set $A_N$.

To show part (a), assume that $x_{N+1}$ is a stationary point with respect to $A_{N+1}$ (which includes $a_N$), i.e.,

$$g_{N+1} = A_N^T \mu + \mu_N a_N, \tag{4.20}$$

where $\mu_N$ (the multiplier associated with $a_N$) must be nonzero. Equating the right-hand sides of (4.19) and (4.20), we obtain

$$A_N^T \lambda - v_*^{N+1} a_* = A_N^T \mu + \mu_N a_N. \tag{4.21}$$

Since $v_*^{N+1} \ne 0$ and $\mu_N \ne 0$, this expression implies that we may express $a_*$ as a linear combination of $A_N^T$ and $a_N$, where the coefficient of $a_N$ is nonzero:

$$a_* = A_N^T \xi + \gamma a_N, \quad\text{with}\quad \gamma = -\frac{\mu_N}{v_*^{N+1}} \ne 0 \tag{4.22}$$

and $\xi = (1/v_*^{N+1})(\lambda - \mu)$.

Stationarity of $x_{N+1}$ with respect to $A_{N+1}$ thus implies a special relationship among the most recently deleted constraint, the working set at $x_N$ and the newly encountered constraint. Any nonzero vector $p$ in the null space of $A_{N+1}$ satisfies

$$A_{N+1} p = \begin{pmatrix} A_N \\ a_N^T \end{pmatrix} p = 0. \tag{4.23}$$

For any such $p$, it follows from the structure of $A_{N+1}$ (see (4.8)) that $A_0 p = 0$, and from (4.22) that $a_*^T p = 0$; hence, $p$ lies in the null space of $A_*$. Since $Z_*^T H Z_*$ is positive definite (i.e., (4.7) holds), we conclude that $p^T H p > 0$ for $p$ satisfying (4.23). Thus, the reduced Hessian at $x_{N+1}$ with respect to $A_{N+1}$ is positive definite, and $x_{N+1}$ is a minimizer with respect to $A_{N+1}$.

To verify part (b), assume that $a_N$ is linearly dependent on $A_N^T$ and $a_*$, i.e., that $a_N = A_N^T \beta + \beta_* a_*$, where $\beta_* \ne 0$. Simple rearrangement then gives $a_* = (1/\beta_*) a_N - (1/\beta_*) A_N^T \beta$. Substituting in (4.19), we obtain

$$g_{N+1} = A_N^T \lambda - \frac{v_*^{N+1}}{\beta_*} a_N + \frac{v_*^{N+1}}{\beta_*} A_N^T \beta,$$

which shows that $x_{N+1}$ must be a stationary point with respect to $A_{N+1}$. Positive-definiteness of the reduced Hessian follows as before, and hence $x_{N+1}$ is a minimizer with respect to $A_{N+1}$. ∎

Lemma 4.5 is crucial in ensuring that adding a constraint in an inertia-controlling algorithm cannot produce a stationary point where the reduced Hessian is not positive definite.

4.3.  Properties  of  the  search  direction 

When  the  reduced  Hessian  is  positive  definite,  it  is  straightforward  to  show  that 
the  search  direction  possesses  the  feasibility  and  descent  properties  discussed  in 
Section  2.3. 

Theorem 4.1. Consider an iterate $x$ and a valid working set $A$ such that $Z^T H Z$ is positive definite. If $p$ as defined by (4.1) is nonzero, then $p$ is a descent direction. Furthermore, if constraint $a_*$ is the most recently deleted constraint, it also holds that $a_*^T p > 0$.

Proof. See Fletcher [Fle81, page 89]. Writing out the equations of (4.1), we have

$$g + Hp = A^T \mu \quad\text{and}\quad Ap = 0.$$

Multiplying the first equation by $p^T$ gives $g^T p = -p^T H p$. Since $p = Z p_Z$ for some nonzero $p_Z$ and $Z^T H Z$ is positive definite, $p^T H p$ must be strictly positive, and hence $g^T p < 0$. If constraint $a_*$ is the most recently deleted constraint, $x$ must be part of a sequence of intermediate iterates satisfying properties I1-I3 (Section 4.2), where a negative multiplier was deleted at the first point of the sequence. Lemma 4.4 thus shows that $a_*^T p > 0$. ∎

We  now  wish  to  verify  that  the  search  direction  at  a  nonstandard  iterate  (which 
must  be  intermediate)  possesses  the  desired  properties.  Lemma  4.2  shows  that  p  is 
a  direction  of  negative  curvature  when  a  constraint  with  a  zero  multiplier  has  just 
been  deleted.  The  following  theorems  treat  the  two  possible  situations  when  the 
most  recently  deleted  constraint  has  a  negative  multiplier. 

Theorem 4.2. When the reduced Hessian is singular at a nonstandard iterate $x$,
the search direction is a descent direction of zero curvature. If $a_*$ is the most recently
deleted constraint, it also holds that $a_*^T p > 0$.

Proof. When $Z^T H Z$ is singular, $p$ is defined by (4.2) and hence satisfies
$Hp = -\beta A^T \nu$. Multiplying this relation by $p^T$ and using $Ap = 0$, we obtain
$p^T H p = 0$, which verifies that $p$ is a direction of zero curvature. A nonstandard
iterate $x$ must be part of a sequence of intermediate iterates satisfying properties
11-13. We know from Lemma 4.5 that any such $x$ cannot be a stationary point, and
hence $g^T p \ne 0$. Thus, the sign of $\beta$ can always be chosen so that $g^T p < 0$.
Lemma 4.4 then implies that $a_*^T p > 0$, where $a_*$ is the normal of the most recently
deleted constraint. ∎


Theorem 4.3. When the reduced Hessian is indefinite at a nonstandard iterate and
the search direction is defined by (4.3), $p$ is a descent direction of negative curvature.
If $a_*$ is the most recently deleted constraint, it also holds that $a_*^T p > 0$.

Proof. Since $p$ satisfies (4.3), with $Ap = 0$, it follows that

$$ p^T H p = g^T p. \qquad (4.24) $$

As in Theorem 4.2, $x$ must be part of a sequence of intermediate iterates satisfying
properties 11-13. Furthermore, Lemma 4.3 shows that

$$ g = A^T \nu - \nu_* a_*, \quad\text{with } \nu_* > 0, $$

where $a_*$ is the normal of the most recently deleted constraint. Substituting for $g$
in (4.3) and rearranging gives a system for $p$ to which Lemma 3.4 applies, and it
follows that $a_*^T p > 0$. This property implies first (from Lemma 4.4) that $g^T p < 0$,
and then (from (4.24)) that $p^T H p < 0$ as required. ∎

If $\alpha = 1$ at a standard iterate, a constraint is not added to the working set at the
next iterate, which is automatically a minimizer with respect to the same working
set (see the logic for constraint addition in Figure 2). If a new iterate happens to be
a stationary point under any other circumstances, we now show that the multiplier
corresponding to the newly added constraint must be strictly positive.


Lemma 4.6. Assume that $x$ is a typical intermediate iterate, with associated working
set $A$, under the same conditions as in Lemma 4.3. Let $\bar{x} = x + \alpha p$, where
$\alpha > 0$ and constraint $a$ is added to the working set at $\bar{x}$, and let $\bar{A}$ denote the new
working set. If $\bar{x}$ is a stationary point with respect to $\bar{A}$, then the Lagrange multiplier
associated with the newly added constraint is positive.

Proof. If $\bar{x}$ is a stationary point with respect to $\bar{A}$, we have by definition that
$\bar{g} = A^T \mu_A + a \mu_a$, where $\mu_a$ is the multiplier corresponding to the newly added
constraint. Since the conditions of this lemma are the same as those of Lemma 4.5,

$$ -\nu_* a_* = A^T \lambda + \mu_a a, \quad\text{where } \nu_* > 0 \qquad (4.25) $$

(see (4.21)). Lemma 4.2 and Theorems 4.1-4.3 show that $a_*^T p > 0$ at every
intermediate iterate. Since constraint $a$ is added to the working set, we know that
$a^T p < 0$. Multiplying (4.25) by $p^T$ and using $Ap = 0$ shows that
$-\nu_* a_*^T p = \mu_a a^T p$, and we conclude that $\mu_a > 0$ as desired. ∎

4.4. Choosing the initial working set

Inertia-controlling methods require a procedure for finding an initial working set $A_0$
that has full row rank and an associated positive-definite reduced Hessian $Z_0^T H Z_0$.
Two different inertia-controlling methods starting with the same working set $A_0$
will generate identical iterates. However, procedures for finding $A_0$ are usually
dependent on the method used to solve the KKT system and therefore $A_0$ may vary
substantially from one method to another. Ironically, this implies that different
inertia-controlling methods seldom generate the same iterates in practice!

In order to ensure that the reduced Hessian is positive definite, the initial working
set may need to include “new” constraints that are not specified in the original
problem. These have been called temporary constraints, pseudo-constraints (Fletcher
and Jackson [FJ74]), or artificial constraints (Gill and Murray [GM78]). The only
requirement for a temporary constraint is linear independence from constraints
already in the working set. The strategy for choosing temporary constraints depends
on the mechanics of the particular QP method.

For example, simple bounds involving the current values of variables are convenient
in certain contexts (see, e.g., Fletcher and Jackson [FJ74]). Suppose that the
value of the first variable at the initial point is (say) 6. The temporary constraint
$x_1 \ge 6$ (or $-x_1 \ge -6$) may be added to the initial working set if its normal satisfies
the linear independence criterion. If this temporary bound is included, the first
variable is fixed at 6 until a minimizer is reached. At a minimizer, the sign of each
temporary constraint normal (i.e., the direction of the inequality) is chosen so that
its multiplier is nonpositive, and temporary constraints are deleted first if there is
a choice.

Since a reduced Hessian of dimension zero is positive definite, the earliest approach
was always to choose an initial working set of $n$ constraints, regardless of the
nature of the reduced Hessian (see Fletcher [Fle71] and Gill and Murray [GM78]).
However, this strategy may be inefficient because of the nontrivial effort that must
be expended to delete all the temporary constraints.

Ideally, the initial working set should be well conditioned and contain as few
temporary constraints as possible. A strategy that attempts to fulfill these aims
is used in the method of QPSOL [GMSW84c]. Let $A'$ denote the subset of rows
of $A$ corresponding to the set of constraints active at $x_0$. A trial working set (the
maximal linearly independent subset of the rows of $A'$) is selected by computing
an orthogonal-triangular factorization in which one row is added at a time. If the
diagonal of the triangular factor resulting from addition of a particular constraint
is “too small”, the constraint is considered dependent and is not included.
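
The row-at-a-time selection can be sketched as follows. This is a minimal illustration, not the QPSOL implementation: it uses modified Gram-Schmidt orthogonalization as a stand-in for the orthogonal-triangular factorization, and the function name `select_independent_rows` and the relative tolerance `tol` are assumptions made for the example.

```python
import math

def select_independent_rows(rows, tol=1e-8):
    """Return indices of rows accepted as linearly independent, one row at a time."""
    basis = []      # orthonormal vectors spanning the rows accepted so far
    accepted = []
    for idx, row in enumerate(rows):
        r = [float(x) for x in row]
        scale = math.sqrt(sum(x * x for x in r))
        # Remove the components of the candidate row along the current basis.
        for q in basis:
            proj = sum(a * b for a, b in zip(r, q))
            r = [a - proj * b for a, b in zip(r, q)]
        # The norm of the remainder plays the role of the new diagonal element
        # of the triangular factor; a "too small" value signals dependence.
        diag = math.sqrt(sum(x * x for x in r))
        if scale > 0.0 and diag > tol * scale:
            basis.append([x / diag for x in r])
            accepted.append(idx)
    return accepted

# Hypothetical active-constraint normals; the third row is the sum of the first two.
active_rows = [
    [1.0, 0.0, 0.0],
    [0.0, 1.0, 0.0],
    [1.0, 1.0, 0.0],
    [0.0, 0.0, 2.0],
]
trial_working_set = select_independent_rows(active_rows)
print(trial_working_set)
```

The choice of threshold is the delicate part in practice: a tolerance that is too loose admits nearly dependent rows and an ill-conditioned working set, while one that is too tight may force extra temporary constraints.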

Let $A_W$ denote the resulting trial working set, with $Z_W$ a null-space basis for
$A_W$. If $Z_W^T H Z_W$ is positive definite, $A_W$ is an acceptable initial working set, and $A_0$
is taken as $A_W$. Otherwise, the requisite temporary constraint normals are taken as
the columns of $Z_W$ that lie in the subspace spanned by the eigenvectors associated
with the nonpositive eigenvalues of $Z_W^T H Z_W$. With the TQ factorization (see (6.1)),
these columns can be identified by attempting to compute the Cholesky factorization
of $Z_W^T H Z_W$ with symmetric interchanges (for details, see Gill et al. [GMSW85]).

In contrast, methods that rely on sparse factorizations to solve KKT-related
systems explicitly (see Section 5.1) have more difficulty in defining $A_0$ efficiently,
and no methods are known that attempt to minimize the number of temporary
constraints.

In practice, the task of finding $A_0$ is often complicated by the desirability of
specifying a “target” initial working set. For example, the QP may occur as a
subproblem within an SQP method for nonlinearly constrained optimization with a
“warm start” option; see Gill et al. [GMSW86].

4.5.  Convergence 

In all our discussion thus far, we have repeatedly assumed at various crucial
junctures that $\alpha > 0$, because of the following theoretical (and practical) difficulty. A
degenerate stationary point for (1.1) is a point at which the gradient of $\varphi$ is a linear
combination of the active constraint normals, but the active constraints are linearly
dependent. (A degenerate vertex is the most familiar example of such a point.) A
degenerate stationary point poses difficulties for an algorithm in which constraints
are deleted and added one at a time because the algorithm may cycle. Although
a feasible direction of decrease can be found by deleting a single constraint, the
algorithm may be unable to move because each search direction $p$ has the property
that $a_i^T p < 0$ for an idle (dependent) constraint $i$, which means that the step to the
nearest constraint is zero. In order to proceed, it may be necessary to move “off”
several constraints simultaneously, thereby violating the inertia-controlling strategy.
For a discussion of techniques for moving away from degenerate stationary points,
see Fletcher [Fle85,Fle86], Busovača [Bus85], Dax [Dax85], Osborne [Osb85], Ryan
and Osborne [RO86] and Gill et al. [GMSW].

Proofs  of  convergence  for  inertia-controlling  methods  if  no  degenerate  stationary 
points  exist  have  been  given  in  [Fle71,Fle81,GM78,Gou86].  We  therefore  simply 
state  the  result. 


Theorem 4.4. If $\varphi(x)$ is bounded below in the feasible region of (1.1) and the feasible
region contains no degenerate stationary points, an inertia-controlling algorithm
converges in a finite number of iterations to a point $x$ where

(i) $Z^T g = 0$, $Z^T H Z$ is positive definite and $\mu > 0$; or

(ii) $Z^T g = 0$, $Z^T H Z$ is singular and $\mu > 0$. ∎

Given the same initial working set, inertia-controlling methods generate mathematically
identical iterates. Practical inertia-controlling methods differ in the techniques
used to determine the nature of the reduced Hessian and to compute the search
direction and Lagrange multipliers.

5.1. Using a nonsingular extended KKT system

When solving a general QP with an inertia-controlling method, the “real” KKT matrix
may be singular for any number of iterations. In this section, we show how to
define the vectors of interest in terms of linear systems involving a nonsingular matrix
that (optionally) includes the normal of the most recently deleted constraint; in
effect, an “extended” KKT matrix. Fletcher’s original method [Fle71] uses the
approach to be described, although he describes the computations in terms of a
partitioned inverse. Any “black box” equation solver that provides the necessary
information may be used to solve these equations (see, e.g., Gould [Gou86] and Gill
et al. [GMSW]).

At a given iterate, let $A_*$ denote either the current working set $A$ or a matrix
of full row rank whose $i_*$-th row is $a_*^T$ (the most recently deleted constraint) and
whose remaining rows are those of $A$. (If $A_* = A$, $i_*$ is taken as zero.) The row
dimension of $A_*$ is denoted by $m_*$, which is $m$ when $A_* = A$ and $m + 1$ when $A_* \ne A$.
Let $Z$ and $Z_*$ be null-space bases for $A$ and $A_*$. The inertia-controlling strategy
guarantees that the reduced Hessian $Z_*^T H Z_*$ is positive definite. We allow $A_*$ to
be $A$ only when $Z^T H Z$ is positive definite, in order to guarantee its nonsingularity
at intermediate iterates. (Recall that $Z^T H Z$ can change from indefinite to singular
following a constraint addition.) However, it may be convenient to retain $a_*$ in $A_*$
even in the positive-definite case.

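The matrix $K_*$ is defined as

$$ K_* \equiv \begin{pmatrix} H & A_*^T \\ A_* & 0 \end{pmatrix}, \qquad (5.1) $$

and we emphasize that $K_*$ must be nonsingular (see Corollary 3.1). Let $u$, $v$, $y$ and
$w$ be the (unique) solutions of

$$ K_* \begin{pmatrix} u \\ v \end{pmatrix} = \begin{pmatrix} g \\ 0 \end{pmatrix}, \qquad (5.2) $$

$$ K_* \begin{pmatrix} y \\ w \end{pmatrix} = \begin{pmatrix} 0 \\ e_* \end{pmatrix}, \qquad (5.3) $$

where $u$ and $y$ have $n$ components, $v$ and $w$ have $m_*$ components, and $e_*$ denotes
the $i_*$-th coordinate vector of dimension $m_*$. When $K_* = K$, $y$ and $w$ may be taken
as zero. Any vector name with subscript “$A$” denotes the subvector corresponding
to columns of $A^T$, and similarly for the subscript “$*$”. If $i_* = 0$, the $i_*$-th component
of a vector is null.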

The vectors $q$ and $\mu$ associated with the KKT system (2.7) satisfy

$$ Hq + A^T \mu = g, \quad Aq = 0, \qquad (5.4) $$

so that $q = u$ of (5.2) when $K = K_*$. In an inertia-controlling method, the search
direction $p$ is taken as $-q$ in the positive-definite case (see (4.1)), as $y$ in the singular
case or in the indefinite case with a zero multiplier (see (4.2) and (4.4)), or as $q$ (see
(4.3)). Thus, $p$ is available directly from (5.2) or (5.3) in two situations: when
$K_* = K$, in which case $p$ must be $-q$ since $Z^T H Z$ is positive definite; or when
$p = y$. The next lemma shows how to obtain $q$ and $\mu$ from the vectors of (5.2) and
(5.3) when $K_* \ne K$.

Lemma 5.1. If $K$ is nonsingular and $K_* \ne K$, the vectors $q$ and $\mu$ are given by

$$ q = u + \beta y, \qquad \mu = v_A + \beta w_A, \qquad\text{where } \beta = -\frac{v_*}{w_*}. \qquad (5.5) $$

Proof. Writing out the equations of (5.2) and (5.3), we have

$$ Hu + A^T v_A + a_* v_* = g, \quad Au = 0, \quad a_*^T u = 0; $$

$$ Hy + A^T w_A + a_* w_* = 0, \quad Ay = 0, \quad a_*^T y = 1. $$

For any scalar $\beta$, the vectors $u' = u + \beta y$ and $v' = v + \beta w$ satisfy

$$ Hu' + A^T v'_A + a_*(v_* + \beta w_*) = g \quad\text{and}\quad Au' = 0. \qquad (5.6) $$

Both $K$ and $K_*$ are nonsingular, which implies that $w_* \ne 0$ (see Corollary 3.3). If
$\beta$ is chosen as $-v_*/w_*$, the coefficient of $a_*$ in (5.6) is zero, and $u'$ and $v'_A$ satisfy
(5.4). The desired result follows from the uniqueness of $q$ and $\mu$. ∎

When $K_* \ne K$, the following two lemmas indicate how to use $u$, $v$, $y$ and $w$ to
decide on the status of the reduced Hessian and of the current iterate.

Lemma 5.2. Assume that $K_* \ne K$. Then: (a) if $w_* < 0$, $Z^T H Z$ is positive definite;
(b) if $w_* = 0$, $Z^T H Z$ is singular; and (c) if $w_* > 0$, $Z^T H Z$ is indefinite.

Proof. Since $A_*$ is chosen so that $Z_*^T H Z_*$ is positive definite, the results follow
from Corollary 3.3. ∎
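
Lemmas 5.1 and 5.2 can be checked on a tiny example. The sketch below is illustrative only and rests on assumptions not in the text: $n = 2$, $H = \mathrm{diag}(1, h)$, an empty working set $A$, and $a_* = (0, 1)^T$, so that $Z_*^T H Z_* = 1$ while $Z^T H Z = H$. The helper names `solve` and `classify` are hypothetical, and exact rational arithmetic is used so that the sign test on $w_*$ is unambiguous.

```python
from fractions import Fraction

def solve(M, rhs):
    """Solve M x = rhs by Gauss-Jordan elimination with exact rationals."""
    n = len(M)
    aug = [[Fraction(v) for v in row] + [Fraction(b)] for row, b in zip(M, rhs)]
    for col in range(n):
        piv = next(r for r in range(col, n) if aug[r][col] != 0)
        aug[col], aug[piv] = aug[piv], aug[col]
        aug[col] = [v / aug[col][col] for v in aug[col]]
        for r in range(n):
            if r != col and aug[r][col] != 0:
                f = aug[r][col]
                aug[r] = [a - f * b for a, b in zip(aug[r], aug[col])]
    return [row[n] for row in aug]

def classify(h, g=(3, 2)):
    """Solve (5.2) and (5.3) for H = diag(1, h), A empty, a_* = (0, 1)^T."""
    K_star = [[1, 0, 0],
              [0, h, 1],
              [0, 1, 0]]
    u1, u2, v_star = solve(K_star, [g[0], g[1], 0])   # system (5.2)
    y1, y2, w_star = solve(K_star, [0, 0, 1])         # system (5.3)
    if w_star < 0:
        kind = "positive definite"
    elif w_star == 0:
        kind = "singular"
    else:
        kind = "indefinite"
    return kind, (u1, u2, v_star), (y1, y2, w_star)

for h in (4, 0, -1):
    print(h, classify(h)[0])

# Lemma 5.1 for h = 4: q = u + beta*y with beta = -v_*/w_* should satisfy Hq = g
# (A is empty here, so there is no multiplier term).
kind, (u1, u2, v_star), (y1, y2, w_star) = classify(4)
beta = -v_star / w_star
q = [u1 + beta * y1, u2 + beta * y2]
Hq = [1 * q[0], 4 * q[1]]
print(q, Hq)
```

In this example $w_* = -h$, so the sign of $w_*$ mirrors the sign of the single eigenvalue of $Z^T H Z$ that is not already controlled by $Z_*^T H Z_*$.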

Lemma 5.3. Assume that $K_* \ne K$. The point $x$ is a stationary point with respect
to $A$ if $u = 0$ and $v_* = 0$.

Proof. The result is immediate from the definition of $u$ and $v$. ∎

5.1.1. Updating u, v, w and y

The next four lemmas specify how $u$, $v$, $y$ and $w$ can be recurred from iteration to
iteration. Note that “old” and “new” versions of $u$ and $y$ always have $n$ components.

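Lemma 5.4. (Move to a new iterate.) Suppose that $x$ is an iterate of an
inertia-controlling method. Let $\bar{x} = x + \alpha p$. The solution of

$$ K_* \begin{pmatrix} \bar{u} \\ \bar{v} \end{pmatrix} = \begin{pmatrix} \bar{g} \\ 0 \end{pmatrix}, \qquad (5.7) $$

where $\bar{g} = g(\bar{x}) = g + \alpha H p$, is given by

$$ \bar{u} = \begin{cases} (1 - \alpha) u & \text{if } p = -q \\ (1 + \alpha) u & \text{if } p = q \\ u & \text{if } p = y \end{cases}
\qquad
\bar{v}_* = \begin{cases} (1 - \alpha) v_* & \text{if } p = -q \\ (1 + \alpha) v_* & \text{if } p = q \\ v_* - \alpha w_* & \text{if } p = y \end{cases}
\qquad
\bar{v}_A = v_A - \alpha (a_*^T p) w_A. \qquad (5.8) $$

Proof. In this lemma, the move from $x$ to $\bar{x}$ changes only the gradient (not the
working set). The desired result can be verified by substitution from Lemma 5.1
and the various definitions of $p$. ∎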

Following the addition of a constraint (say, $a$) to the working set, the “real”
reduced Hessian may become positive definite, so that strictly speaking $a_*$ is no
longer necessary. Nonetheless, it may be desirable to retain $a_*$ in $A_*$ for numerical
reasons; various strategies for making this decision are discussed in [Fle71]. Updates
can be performed in either case, using the $n$-vector $z$ and $m_*$-vector $t$ defined by

$$ K_* \begin{pmatrix} z \\ t \end{pmatrix} = \begin{pmatrix} a \\ 0 \end{pmatrix}, \qquad (5.9) $$

i.e., such that

$$ Hz + A^T t_A + t_* a_* = a, \quad Az = 0 \quad\text{and}\quad a_*^T z = 0. \qquad (5.10) $$

We first consider the case when $a$ can be added directly to $A_*$. Following the
updates given in the next lemma, $m_*$ increases by one and the “new” $v$ and $w$ have
one additional component.


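Lemma 5.5. (Constraint addition; independent case.) Let $x$ denote an iterate of
an inertia-controlling method. Assume that constraint $a$ is to be added to the working
set at $x$, where $A_*^T$ and $a$ are linearly independent. Let

$$ \rho = \frac{a^T u}{a^T z} \quad\text{and}\quad \eta = \frac{a^T y}{a^T z}. \qquad (5.11) $$

Then the vectors $\bar{u}$, $\bar{v}$, $\bar{y}$ and $\bar{w}$ defined by

$$ \bar{u} = u - \rho z, \quad \bar{v} = \begin{pmatrix} v - \rho t \\ \rho \end{pmatrix}, \qquad
\bar{y} = y - \eta z, \quad \bar{w} = \begin{pmatrix} w - \eta t \\ \eta \end{pmatrix} \qquad (5.12) $$

satisfy

$$ \begin{pmatrix} H & A_*^T & a \\ A_* & 0 & 0 \\ a^T & 0 & 0 \end{pmatrix}
\begin{pmatrix} \bar{u} \\ \bar{v} \end{pmatrix} = \begin{pmatrix} g \\ 0 \\ 0 \end{pmatrix}
\quad\text{and}\quad
\begin{pmatrix} H & A_*^T & a \\ A_* & 0 & 0 \\ a^T & 0 & 0 \end{pmatrix}
\begin{pmatrix} \bar{y} \\ \bar{w} \end{pmatrix} = \begin{pmatrix} 0 \\ e_* \\ 0 \end{pmatrix}. \qquad (5.13) $$

If $A_* = A$, $y$ and $w$ have dimension zero, and are not updated.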

Proof. When $a$ and $A_*^T$ are linearly independent, (5.10) shows that $z$ must be
nonzero. Since $A_* z = 0$ and $Z_*^T H Z_*$ is positive definite, $a^T z = z^T H z > 0$, so that $\rho$
and $\eta$ are well defined.

For any scalar $\rho$, (5.2) and (5.10) imply that

$$ \begin{pmatrix} H & A_*^T & a \\ A_* & 0 & 0 \\ a^T & 0 & 0 \end{pmatrix}
\begin{pmatrix} u - \rho z \\ v - \rho t \\ \rho \end{pmatrix}
= \begin{pmatrix} g \\ 0 \\ a^T u - \rho a^T z \end{pmatrix}. \qquad (5.14) $$

The linear independence of $a$ and $A_*^T$ means that the solution vectors of (5.13) are
unique. By choosing $\rho$ so that the last component of the right-hand side of (5.14)
vanishes, we see that $\bar{u}$ and $\bar{v}$ of (5.12) satisfy the first equation of (5.13). A similar
argument gives the updates for $y$ and $w$. ∎
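
The updates of Lemma 5.5 can be verified numerically. The sketch below is a check under assumed data rather than part of any implementation: $H = \mathrm{diag}(2, 3, 4)$, $A$ empty, $a_* = (1, 0, 0)^T$ and a new constraint normal $a = (1, 1, 1)^T$ are hypothetical, as are the helper names. The new constraint is appended as the last row of the extended matrix.

```python
from fractions import Fraction

def solve(M, rhs):
    """Solve M x = rhs by Gauss-Jordan elimination with exact rationals."""
    n = len(M)
    aug = [[Fraction(v) for v in row] + [Fraction(b)] for row, b in zip(M, rhs)]
    for col in range(n):
        piv = next(r for r in range(col, n) if aug[r][col] != 0)
        aug[col], aug[piv] = aug[piv], aug[col]
        aug[col] = [v / aug[col][col] for v in aug[col]]
        for r in range(n):
            if r != col and aug[r][col] != 0:
                f = aug[r][col]
                aug[r] = [a - f * b for a, b in zip(aug[r], aug[col])]
    return [row[n] for row in aug]

def dot(p, q):
    return sum(x * y for x, y in zip(p, q))

def matvec(M, x):
    return [dot(row, x) for row in M]

n = 3
H = [[2, 0, 0], [0, 3, 0], [0, 0, 4]]
a_star = [1, 0, 0]          # most recently deleted constraint (the only row of A_*)
a = [1, 1, 1]               # constraint to be added
g = [1, -2, 5]

# Extended KKT matrix K_* of (5.1) and the solutions of (5.2), (5.3), (5.9).
K_star = [H[i] + [a_star[i]] for i in range(n)] + [a_star + [0]]
uv = solve(K_star, g + [0])
yw = solve(K_star, [0, 0, 0, 1])
zt = solve(K_star, a + [0])
u, v = uv[:n], uv[n:]
y, w = yw[:n], yw[n:]
z, t = zt[:n], zt[n:]

# Updates (5.11)-(5.12).
rho = dot(a, u) / dot(a, z)
eta = dot(a, y) / dot(a, z)
u_new = [ui - rho * zi for ui, zi in zip(u, z)]
v_new = [vi - rho * ti for vi, ti in zip(v, t)] + [rho]
y_new = [yi - eta * zi for yi, zi in zip(y, z)]
w_new = [wi - eta * ti for wi, ti in zip(w, t)] + [eta]

# Matrix of (5.13): K_* bordered by the new constraint normal.
K_new = ([H[i] + [a_star[i], a[i]] for i in range(n)]
         + [a_star + [0, 0], a + [0, 0]])
lhs_uv = matvec(K_new, u_new + v_new)
lhs_yw = matvec(K_new, y_new + w_new)
print(lhs_uv, lhs_yw)
```

Only one additional solve with $K_*$ (for $z$ and $t$) is needed per constraint addition; the bordered system (5.13) is never solved directly.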

If $Z^T H Z$ is positive definite and $K_* \ne K$, $a_*$ can be deleted from $A_*$, and $K_*$
then becomes $K$ itself. The following lemma may be applied in two situations:
when a constraint is deleted from the working set at a minimizer and the reduced
Hessian remains positive definite after deletion; and at an intermediate iterate after
a constraint has been added that makes $Z^T H Z$ positive definite.


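Lemma 5.6. (Deleting $a_*$ from $A_*$.) Suppose that: $x$ is an iterate of an
inertia-controlling method, $K_* \ne K$, and $Z^T H Z$ is positive definite. Then the vectors $\bar{u}$ and
$\bar{v}$ defined by

$$ \bar{u} = u + \zeta y, \qquad \bar{v} = v_A + \zeta w_A, \qquad\text{where } \zeta = -\frac{v_*}{w_*}, \qquad (5.15) $$

satisfy

$$ H\bar{u} + A^T \bar{v} = g, \quad A\bar{u} = 0. \qquad (5.16) $$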


Proof. Let $u' = u + \zeta y$, $v' = v + \zeta w$ for some scalar $\zeta$. Substituting these values
in (5.2), we have

$$ H(u + \zeta y) + A^T (v_A + \zeta w_A) + a_*(v_* + \zeta w_*) = g. $$

It follows that (5.16) will be satisfied by $u'$ and $v'_A$ if $v_* + \zeta w_* = 0$. It is permissible
to delete $a_*$ from $A_*$ only if $Z^T H Z$ is positive definite, which means that $w_* < 0$,
and hence $\zeta$ is well defined. ∎

Note that $y$ and $w$ are no longer needed to define the search direction after $a_*$
has been removed.

The only remaining possibility occurs when $a$, the constraint to be added, is
linearly dependent on the rows of $A_*$; in this case, $z = 0$ in (5.9). We know from
Lemma 4.5 that the iterate just reached must be a minimizer with respect to the
working set composed of $A$ and $a$, which means that $a_*$ is no longer necessary.
However, it is not possible to update $u$ using Lemma 5.5 (because $a^T z = 0$), nor to
apply Lemma 5.6 (because $w_*$ may be zero). The following lemma gives an update that
simultaneously removes $a_*$ from $A_*$ and adds $a$ to the working set. After application
of these updates, $\bar{A}$ is the “real” working set at the current iterate, and the algorithm
either terminates or deletes a constraint (which cannot be $a$; see Lemma 4.6).

Lemma 5.7. (Constraint addition; dependent case.) Suppose that $x$ is an iterate
of an inertia-controlling method and that $K_* \ne K$. Assume that $a$ is to be added
to the working set at $x$, and that $a$ and $A_*^T$ are linearly dependent. Let $\bar{A}$ denote $A$
with $a^T$ as an additional row, and define $\omega = v_*/t_*$. The vectors $\bar{u}$ and $\bar{v}$ specified
by

$$ \bar{u} = 0, \qquad \bar{v}_A = v_A - \omega t_A, \qquad \bar{v}_a = \omega, \qquad (5.17) $$

where $\bar{v}_a$ denotes the component of $\bar{v}$ corresponding to $a$, satisfy

$$ \begin{pmatrix} H & \bar{A}^T \\ \bar{A} & 0 \end{pmatrix}
\begin{pmatrix} \bar{u} \\ \bar{v} \end{pmatrix} = \begin{pmatrix} g \\ 0 \end{pmatrix}. $$

Proof. First, observe that linear dependence of $A_*^T$ and $a$ means that $z = 0$.
Lemma 2.1 shows that $a$ cannot be linearly dependent on $A^T$, which implies that
$t_* \ne 0$. Lemma 4.5 tells us that $x$ must be a minimizer with respect to a working
set, so that $u = 0$. The desired results follow from substitution. ∎

The following lemma mentions a further efficiency that may be achieved once a
minimizer has been reached.

Lemma 5.8. If an iterate $x$ is a minimizer with respect to $A$, the vector $u$ is zero
for all subsequent iterations.

Proof. When $x$ is a minimizer with respect to a working set $A$, $g$ is a linear
combination of the columns of $A^T$, so that $u = 0$. The result of the lemma follows
by noting that none of the recurrence relations for $u$ alters this value. Hence, only
$v$, $y$ and $w$ need to be stored and updated thereafter. ∎

The following theorem summarizes the algorithmic implications of all these results.

Theorem 5.1. In an inertia-controlling method based on using a nonsingular matrix
$K_*$ as described, the linear system (5.2) needs to be solved explicitly for $u$ and $v$
only once (at the first iterate); these vectors can thereafter be updated. The vectors
$y$ and $w$ must be computed by solving (5.3) at each minimizer, since $w$ is used to
determine the nature of the reduced Hessian when a constraint is deleted; $y$ and $w$
may be updated when a constraint is added to the working set. The vectors $z$ and
$t$ must be computed by solving (5.9) whenever a constraint is added to the working
set. ∎

Figures  4  and  5  specify  the  computations  associated  with  deleting  and  adding  a 
constraint  (the  boxed  portions  of  Figures  1  and  2). 

For simplicity, two special circumstances are not shown: in Figure 4, $a_*$ is always
deleted from $A_*$ when $\mu_* = 0$ and the reduced Hessian remains positive definite after
deletion, to allow the algorithm to proceed if another constraint is deleted; and if
$A_* = A$ in Figure 5, it is not necessary to test the nature of the reduced Hessian,
which must be positive definite.

$a_*$ <- (the constraint being deleted);  $A_*$ <- $A$;
compute $y$ and $w$ by solving (5.3);
determine the nature of $Z^T H Z$ (Lemma 5.2);
if positive_definite then
    (optionally) delete $a_*$ from $A_*$;
    update $u$ and $v$ (Lemma 5.6);
end if


Figure 4: Deleting a constraint from the working set.


solve (5.9) to obtain $z$ and $t$;
if $z \ne 0$ then
    add $a$ to $A_*$; update $u$, $v$, $y$ and $w$ (Lemma 5.5);
    determine the nature of $Z^T H Z$ (Lemma 5.2);
    if positive_definite then
        (optionally) delete $a_*$ from $A_*$;
        update $u$ and $v$ (Lemma 5.6);
    end if
else ($z = 0$)
    remove $a_*$ from $A_*$ and add $a$ to the working set;
    update $u$ and $v$ (Lemma 5.7);
    positive_definite <- true;
end if


Figure 5: Adding constraint $a$ to the working set.

6.  Two  Specific  Methods 

In  this  section  we  give  details  concerning  the  factorizations  used  in  implementing  two 
specific  inertia-controlling  methods.  The  method  of  Section  6.1  is  based  directly  on 
the  recurrence  relations  of  Section  5,  and  always  retains  a  positive-definite  reduced 
Hessian.  In  contrast,  the  method  of  Section  6.2  updates  a  reduced  Hessian  that  is 
allowed  to  be  positive  definite,  singular  or  indefinite. 

6.1.  Updating  an  explicit  positive-definite  reduced  Hessian 

We now discuss an algorithm in which factorizations of $A_*$ and of the (necessarily
positive definite) matrix $Z_*^T H Z_*$ are used to solve the equations given in Section 5.1.
We consider factorizations of $A_*$ of the form

$$ A_* Q_* = A_* ( Z_* \;\; Y_* ) = ( 0 \;\; T ), \qquad (6.1) $$

where $T$ is a nonsingular $m_* \times m_*$ matrix, $Q_*$ is an $n \times n$ nonsingular matrix, and
$Z_*$ and $Y_*$ are the first $n - m_*$ and last $m_*$ columns of $Q_*$.

Representing $A_*$ by this factorization leads to simplification of the equations to
be solved. In many implementations, $Q_*$ is chosen so that $T$ is triangular (see, e.g.,
Gill et al. [GMSW84a]). In the reduced-gradient method, $Q_*$ is defined so that $T$
is the usual basis matrix $B$. The columns of $Z_*$ form a basis for the null space of
$A_*$. The columns of $Y_*$ form a basis for the range space of $A_*^T$ only if $Y_*^T Z_* = 0$.

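Let $n_Z = n - m_*$. Let $Q$ denote the (nonsingular) matrix

$$ Q = \begin{pmatrix} Q_* & 0 \\ 0 & I \end{pmatrix}, $$

where $I$ is the identity of dimension $m_*$. The $n_Z$-vector $u_Z$ and the $m_*$-vector $u_Y$
are defined by

$$ u = Q_* \begin{pmatrix} u_Z \\ u_Y \end{pmatrix} = Z_* u_Z + Y_* u_Y. \qquad (6.2) $$

Similarly,

$$ y = Q_* \begin{pmatrix} y_Z \\ y_Y \end{pmatrix} \quad\text{and}\quad z = Q_* \begin{pmatrix} z_Z \\ z_Y \end{pmatrix}. \qquad (6.3) $$

Multiplying (5.2) by $Q^T$ and substituting from (6.1) and (6.2), we obtain

$$ \begin{pmatrix} Z_*^T H Z_* & Z_*^T H Y_* & 0 \\ Y_*^T H Z_* & Y_*^T H Y_* & T^T \\ 0 & T & 0 \end{pmatrix}
\begin{pmatrix} u_Z \\ u_Y \\ v \end{pmatrix} = \begin{pmatrix} Z_*^T g \\ Y_*^T g \\ 0 \end{pmatrix}. \qquad (6.4) $$

Since $T$ is nonsingular, the third equation of the partitioned system (6.4) implies
that $u_Y = 0$, so that $u$ and $v$ are obtained by solving

$$ Z_*^T H Z_* u_Z = Z_*^T g, \qquad T^T v = Y_*^T g - Y_*^T H Z_* u_Z, \qquad (6.5) $$

and setting $u = Z_* u_Z$. The vectors $z$ and $t$ of (5.9) can similarly be found by solving

$$ Z_*^T H Z_* z_Z = Z_*^T a, \qquad T^T t = Y_*^T a - Y_*^T H Z_* z_Z, \qquad (6.6) $$

and setting $z = Z_* z_Z$.

We also need to compute the vectors $y$ and $w$ of (5.3) at a minimizer. Applying
the same transformation as above and substituting from (6.3) gives the following
equations to be solved:

$$ T y_Y = e_*, \qquad Z_*^T H Z_* y_Z = -Z_*^T H Y_* y_Y, \qquad T^T w = -Y_*^T H y, \qquad (6.7) $$

where $y = Z_* y_Z + Y_* y_Y$.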

By construction, the reduced Hessian $Z_*^T H Z_*$ is positive definite; let its Cholesky
factorization be

$$ Z_*^T H Z_* = R_*^T R_*, \qquad (6.8) $$

where $R_*$ is an upper-triangular matrix. An obvious strategy for a practical
implementation is to retain the matrices $T$, $Z_*$ and $Y_*$ and the Cholesky factor $R_*$. As
the iterations proceed, $T$, $Z_*$ and $Y_*$ can be updated to reflect changes in $A_*$, using
Householder transformations or plane rotations if $Q_*$ is orthogonal, and elementary
transformations if $Q_*$ is non-orthogonal; orthogonal transformations are needed in
part of the update for $R_*$ (see Gill et al. [GMSW84a]).
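
To make the role of the Cholesky factor concrete, the following sketch solves a system of the form $Z_*^T H Z_* u_Z = Z_*^T g$ (the first equation of (6.5)) with a factor $R_*^T R_*$. It is a dense, from-scratch illustration and not the updating scheme described in the text: the factor is recomputed rather than updated, and the $2 \times 2$ reduced Hessian and reduced gradient are hypothetical data.

```python
import math

def cholesky_upper(M):
    """Return upper-triangular R with R^T R = M; raise ValueError if M is not positive definite."""
    n = len(M)
    R = [[0.0] * n for _ in range(n)]
    for i in range(n):
        d = M[i][i] - sum(R[k][i] ** 2 for k in range(i))
        if d <= 0.0:
            raise ValueError("matrix is not positive definite")
        R[i][i] = math.sqrt(d)
        for j in range(i + 1, n):
            R[i][j] = (M[i][j] - sum(R[k][i] * R[k][j] for k in range(i))) / R[i][i]
    return R

def solve_with_cholesky(R, b):
    """Solve R^T R x = b by one forward and one backward substitution."""
    n = len(b)
    s = [0.0] * n
    for i in range(n):                      # forward substitution with R^T
        s[i] = (b[i] - sum(R[k][i] * s[k] for k in range(i))) / R[i][i]
    x = [0.0] * n
    for i in reversed(range(n)):            # back substitution with R
        x[i] = (s[i] - sum(R[i][k] * x[k] for k in range(i + 1, n))) / R[i][i]
    return x

reduced_hessian = [[4.0, 2.0], [2.0, 5.0]]   # stands in for Z_*^T H Z_*
reduced_gradient = [2.0, 9.0]                # stands in for Z_*^T g
R = cholesky_upper(reduced_hessian)
u_Z = solve_with_cholesky(R, reduced_gradient)
print(R, u_Z)
```

A failed factorization (the `ValueError` branch) is also a cheap test that a candidate reduced Hessian is not positive definite.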

For illustration, we sketch a particular updating technique in which $T$ is chosen
as upper triangular. In this discussion, barred quantities correspond to the “new”
working set. When a constraint $a$ is added to the working set, $a^T$ becomes the first
row of $\bar{A}_*$. To restore triangular form, we seek a matrix $\tilde{Q}$ that annihilates the first
$n_Z - 1$ elements of $a^T Q_*$, i.e., such that

$$ a^T Q_* \tilde{Q} = ( a^T Z_* \;\; a^T Y_* ) \tilde{Q} = ( 0 \;\; \sigma \;\; a^T Y_* ). \qquad (6.9) $$

This result is achieved by choosing $\tilde{Q}$ of the form

$$ \tilde{Q} = \begin{pmatrix} P & 0 \\ 0 & I \end{pmatrix}, \qquad (6.10) $$

where the $n_Z \times n_Z$ matrix $P$ is composed of a sequence of orthogonal or elementary
transformations. Substituting from (6.10) into (6.9), we have

$$ P^T Z_*^T a = \sigma e_{n_Z}, \qquad (6.11) $$

where $e_{n_Z}$ is the $n_Z$-th coordinate vector. The result is that

$$ \bar{Q}_* = Q_* \tilde{Q} = ( Z_* P \;\; Y_* ) = ( \bar{Z}_* \;\; \bar{Y}_* ), $$

where

$$ Z_* P = ( \bar{Z}_* \;\; \tilde{y} ), \qquad (6.12) $$

and $\bar{Y}_*$ is $Y_*$ with a new first column $\tilde{y}$ (the transformed last column of $Z_*$).

When a constraint is deleted from $A_*$, the deleted row is moved to the first
position by a sequence of cyclic row permutations, which need be applied only to $T$
and $Y_*$. (The columns of $Z_*$ are orthogonal to the rows of $A_*$ in any order.) The
first row of $A_*$ may then be removed and the permuted triangle restored to proper
form by transformations on the right without affecting the last $m_* - 1$ columns of
$Q_*$ or $T$. The result is that $\bar{Y}_*$ is a row-permuted version of the last $m_* - 1$ columns
of $Y_*$, and $\bar{Z}_*$ is given by

$$ \bar{Z}_* = ( Z_* \;\; \tilde{z} ), \qquad (6.13) $$

where $\tilde{z}$ is a transformed version of the first column of $Y_*$.



This updating scheme leads to additional computational simplifications. For
example, consider calculation of $z$ and $t$ from the first equation of (6.6) when a
constraint is added to $A_*$. Multiplying by $P^T$, substituting from (6.11), and letting
$\hat{z}_Z = P^T z_Z$, $\hat{Z} = Z_* P$, we have

$$ \hat{Z}^T H \hat{Z} \hat{z}_Z = \hat{Z}^T a = \sigma e_{n_Z}. \qquad (6.14) $$

The Cholesky factors of $\hat{Z}^T H \hat{Z}$ will be available from the updating (see (6.12)),
and the special form of the right-hand side of (6.14) means that the solve with the
lower-triangular factor reduces to only a single division.

6.2.  Updating  a  general  reduced  Hessian 

In this section we briefly discuss the method of QPSOL [GMSW84c], an
inertia-controlling method based on maintaining an $LDL^T$ factorization of the reduced
Hessian

$$ Z^T H Z = L D L^T, \qquad (6.15) $$

where $L$ is unit lower triangular and $D = \mathrm{diag}(d_j)$. When $Z^T H Z$ can be represented
in the form (6.15), Sylvester’s law of inertia (3.4) shows that $\mathrm{In}(Z^T H Z) = \mathrm{In}(D)$,
and our inertia-controlling strategy thus ensures that $D$ has at most one non-positive
element. The following theorem states that, given a suitable starting point, a
null-space matrix $Z$ exists such that only the last diagonal of $D$ may be non-positive.

Theorem 6.1. Consider an inertia-controlling method in which the initial iterate
$x_0$ is a minimizer. Then at every subsequent iterate there exist an upper-triangular
matrix $T$, a unit lower-triangular matrix $L$, a diagonal matrix $D$ and a null-space
matrix $Z$ with $n_Z$ columns such that

$$ A ( Z \;\; Y ) = ( 0 \;\; T ), \qquad Z^T H Z = L D L^T, \qquad Z^T g = \sigma e_{n_Z}, \qquad (6.16) $$

where $d_j > 0$ for $j = 1, \dots, n_Z - 1$, and $e_{n_Z}$ is the $n_Z$-th coordinate vector.

Proof. An analogous result is proved by Gill and Murray [GM78] for a permuted
form of the TQ factorization. ∎

We emphasize that the vector $Z^T g$ has the simple form (6.16) only when the TQ
factorization of $A$ is updated with elementary or plane rotation matrices applied in
a certain order. In this sense, the method depends critically on the associated linear
algebraic procedures.

The search direction $p$ is always taken as a multiple of $Z p_Z$, where $p_Z$ is the
unique nonzero vector satisfying

$$ L^T p_Z = e_{n_Z}. \qquad (6.17) $$
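
A short numerical sketch may help fix the consequences of (6.17). The factors $L$ and $D$ below are hypothetical (only the last diagonal of $D$ is non-positive, as in Theorem 6.1), and the helper `solve_unit_upper` is an illustrative name. The code reads the inertia off the signs of $D$, solves (6.17) by back substitution, and checks that $LDL^T p_Z$ is $d_{n_Z}$ times the last coordinate vector, so that the curvature $p_Z^T (LDL^T) p_Z$ equals $d_{n_Z}$.

```python
def solve_unit_upper(U, b):
    """Back substitution for U x = b, with U unit upper triangular."""
    n = len(b)
    x = list(b)
    for i in reversed(range(n)):
        x[i] = b[i] - sum(U[i][j] * x[j] for j in range(i + 1, n))
    return x

n = 3
L = [[1, 0, 0],
     [2, 1, 0],
     [-1, 3, 1]]          # unit lower triangular
d = [2, 1, -3]            # diagonal of D; only the last entry is non-positive

# By Sylvester's law of inertia, the signs of d give the inertia of L D L^T.
inertia = (sum(x > 0 for x in d), sum(x < 0 for x in d), sum(x == 0 for x in d))

# Solve L^T p_Z = e_n, as in (6.17).
Lt = [[L[j][i] for j in range(n)] for i in range(n)]
e_last = [0] * (n - 1) + [1]
p_Z = solve_unit_upper(Lt, e_last)

# Form M = L D L^T explicitly (for checking only) and apply it to p_Z.
M = [[sum(L[i][k] * d[k] * L[j][k] for k in range(n)) for j in range(n)]
     for i in range(n)]
M_p = [sum(M[i][j] * p_Z[j] for j in range(n)) for i in range(n)]
curvature = sum(p * m for p, m in zip(p_Z, M_p))
print(inertia, p_Z, M_p, curvature)
```

No product with $L D L^T$ is needed in practice; the explicit matrix `M` is formed here only to confirm the identity.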


The  special  structures  of  D  and  the  reduced  gradient  are  crucial  to  the  following 
theorem. 

Theorem 6.2. Assume that the results of Theorem 6.1 hold, and that $Z$, $L$ and
$D$ are the corresponding matrices. Let $p_z$ be the solution of $L^T p_z = e_{n_z}$. Then the
vector $p = Z p_z$ is a multiple of $q$ of (5.4) if $Z^T g \neq 0$ and $d_{n_z} \neq 0$, and is a multiple
of $y$ of (5.3) if either (a) $Z^T g \neq 0$ and $d_{n_z} = 0$, or (b) $Z^T g = 0$ and $d_{n_z} < 0$.

Proof. In all cases, the definition (6.17) of $p_z$ and the structure of $L$ and $D$ imply
that

$L D L^T p_z = d_{n_z} e_{n_z}$.  (6.18)

First, assume that $Z^T g \neq 0$ and $d_{n_z} \neq 0$, so that $Z^T H Z$ is nonsingular and $q$ is
unique. Recall that $q = Z q_z$, where $Z^T H Z q_z = Z^T g$. We know from Theorem 6.1
that $Z^T H Z = L D L^T$ and $Z^T g = \sigma e_{n_z}$, with $\sigma \neq 0$ by hypothesis. Relation (6.18)
and the uniqueness of $p$ and $q$ thus imply that each is a multiple of the other
(specifically, $q_z = (\sigma / d_{n_z}) p_z$), as required.

We now treat the second case, $Z^T g \neq 0$ and $d_{n_z} = 0$, so that $Z^T H Z$ is singular.
The vector $y$ of (5.3) can be written as $y = Z y_z$, where $y_z$ is a nonzero vector
satisfying $Z^T H Z y_z = 0$. (Recall that $Z^T H Z$ has exactly one zero eigenvalue.) Since
$d_{n_z} = 0$, (6.18) gives

$L D L^T p_z = Z^T H Z p_z = 0$,

as required.

Finally, assume that $Z^T g = 0$ and $d_{n_z} < 0$, which occurs when the reduced
Hessian becomes indefinite immediately following deletion of a constraint with a zero
multiplier. Let $a_s$ be the normal of the deleted constraint with the zero multiplier.
The vector $y$ of (5.3) is given by $y = Z y_z$, where $y_z$ satisfies

$Z^T H Z y_z = -w_s Z^T a_s$ and $a_s^T y = 1$,  (6.19)

with $w_s > 0$. The nature of the updates to $Z$ following a constraint deletion (see
(6.13)) shows that the vector $Z^T a_s$ is given by

$Z^T a_s = \xi e_{n_z}$,  (6.20)

where $\xi = a_s^T z$, with $z$ the new column of $Z$ created by the deletion of $a_s$. Because
of the full rank of the working set, $\xi \neq 0$. Thus, $y_z$ satisfies

$Z^T H Z y_z = -w_s \xi e_{n_z} \neq 0$.  (6.21)

It follows from (6.18) that either $p$ or $-p$ is a direction of negative curvature,
since

$p^T H p = p_z^T Z^T H Z p_z = d_{n_z} < 0$.

If the sign of $p_{n_z}$ (the last component of $p_z$) is chosen so that

$a_s^T p = a_s^T Z p_z = \xi p_{n_z} > 0$,

then examination of (6.18), (6.19) and (6.21) implies that $p$ is a multiple of $y$, as
required. ∎
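
The first two cases of Theorem 6.2 lend themselves to a small numerical check. The sketch below is only an illustration in reduced coordinates: the factors `L`, the diagonals `d` and `d_sing`, and the scalar `sigma` are assumed sample data, the matrix $Z$ is omitted, and $q_z$ is taken to solve $Z^T H Z q_z = Z^T g$ as in the proof.

```python
import numpy as np

# Hypothetical unit lower-triangular factor (assumed data).
L = np.array([[1.0, 0.0, 0.0],
              [0.5, 1.0, 0.0],
              [-1.0, 2.0, 1.0]])
e_n = np.array([0.0, 0.0, 1.0])
p_z = np.linalg.solve(L.T, e_n)          # definition (6.17)

# Case 1: reduced gradient sigma * e_n with sigma != 0 and d_n != 0.
sigma = 0.7
d = np.array([2.0, 3.0, -1.0])
M = L @ np.diag(d) @ L.T                 # nonsingular reduced Hessian
q_z = np.linalg.solve(M, sigma * e_n)
ratio = sigma / d[-1]                    # predicted multiple: q_z = ratio * p_z

# Case 2: d_n = 0, so the reduced Hessian is singular and p_z spans
# its null space.
d_sing = np.array([2.0, 3.0, 0.0])
M_sing = L @ np.diag(d_sing) @ L.T
null_residual = M_sing @ p_z
```

In the first case `q_z` coincides with `ratio * p_z`, and in the second `null_residual` vanishes, matching the two conclusions drawn from (6.18).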

7. Conclusions and Topics for Further Research

This paper has explored in detail the nature of a family of methods for general
quadratic programming. Our aims have been to describe the overall "feel" of an
idealized active-set strategy (Section 2), to provide theoretical validation of the
inertia-controlling strategy (Section 3), to formulate in a uniform notation the equations
satisfied by the search direction (Section 4), and to discuss selected computational
aspects of inertia-controlling methods (Section 5 and Section 6).

Many interesting topics remain to be explored, particularly in the efficient
implementation of these methods. For example, the method of Section 6.1 is identical
in motivation to Fletcher's original method [Fle71], but has not been implemented
in the form described, which avoids the need to update factors of a singular or
indefinite symmetric matrix. Various methods for sparse quadratic programming
could be devised based on the equations of Section 5.1, in addition to those already
suggested by Gould [Gou86] and Gill et al. [GMSW].

As noted in Section 4.4, an open question remains concerning the crucial task
of finding an initial working set in an efficient fashion consistent with the linear
algebraic procedures of the main iterations.

Acknowledgement. The authors are grateful to Nick Gould and Anders Forsgren
for many helpful discussions on quadratic programming. We also thank Dick
Cottle for his bibliographical assistance.

8.  References 

[And74] T. Ando. Generalized Schur complements. Linear Algebra and its Applications, 27,
173-186, 1974.

[Bes84] M. J. Best. Equivalence of some quadratic programming algorithms. Mathematical
Programming, 30, 71-87, 1984.

[BK80] J. R. Bunch and L. C. Kaufman. A computational method for the indefinite quadratic
programming problem. Linear Algebra and its Applications, 34, 341-370, 1980.

[Bus85] S. Busovača. Handling degeneracy in a nonlinear $\ell_1$ algorithm. Technical Report
CS-85-34, Department of Computer Science, University of Waterloo, Waterloo, Canada,
1985.

[Car86] D. Carlson. What are Schur complements, anyway? Linear Algebra and its
Applications, 74, 257-275, 1986.

[CD79] R. W. Cottle and A. Djang. Algorithmic equivalence in quadratic programming, I: a
least distance programming problem. J. Optimization Theory and Applications, 28,
275-301, 1979.

[CHM74] D. Carlson, E. Haynsworth, and T. Markham. A generalization of the Schur
complement by means of the Moore-Penrose inverse. SIAM J. on Applied Mathematics, 26,
169-175, 1974.

[Con80] L. B. Contesse. Une caractérisation complète des minima locaux en programmation
quadratique. Numerische Mathematik, 34, 315-332, 1980.

[Cot74] R. W. Cottle. Manifestations of the Schur complement. Linear Algebra and its
Applications, 8, 189-211, 1974.

[Dax85] A. Dax. The computation of descent directions at degenerate points. Technical report,
Hydrological Service, PO Box 6381, Jerusalem, Israel, 1985.


[FJ74] R. Fletcher and M. P. Jackson. Minimization of a quadratic function of many
variables subject only to upper and lower bounds. J. Institute of Mathematics and its
Applications, 14, 159-174, 1974.

[Fle71] R. Fletcher. A general quadratic programming algorithm. J. Institute of Mathematics
and its Applications, 7, 76-91, 1971.

[Fle81] R. Fletcher. Practical Methods of Optimization. Volume 2: Constrained Optimization.
John Wiley and Sons, Chichester and New York, 1981.

[Fle85] R. Fletcher. Degeneracy in the presence of round-off errors. Numerical Analysis
Report NA89, Department of Mathematical Sciences, University of Dundee, Scotland,
1985.

[Fle86] R. Fletcher. Recent developments in linear and quadratic programming. Numerical
Analysis Report NA94, Department of Mathematical Sciences, University of Dundee,
Scotland, 1986.

[GM78] P. E. Gill and W. Murray. Numerically stable methods for quadratic programming.
Mathematical Programming, 14, 349-372, 1978.

[GMSW] P. E. Gill, W. Murray, M. A. Saunders, and M. H. Wright. General sparse quadratic
programming, to appear.

[GMSW84a] P. E. Gill, W. Murray, M. A. Saunders, and M. H. Wright. Procedures for optimization
problems with a mixture of bounds and general linear constraints. ACM Transactions
on Mathematical Software, 10, 282-298, 1984.

[GMSW84b] P. E. Gill, W. Murray, M. A. Saunders, and M. H. Wright. Sparse matrix methods
in optimization. SIAM J. on Scientific and Statistical Computing, 5, 562-589, 1984.

[GMSW84c] P. E. Gill, W. Murray, M. A. Saunders, and M. H. Wright. User's Guide for
SOL/QPSOL (Version 3.2). Report SOL 84-6, Department of Operations Research,
Stanford University, 1984.

[GMSW85] P. E. Gill, W. Murray, M. A. Saunders, and M. H. Wright. Software and its
relationship to methods. In P. T. Boggs, R. H. Byrd, and R. B. Schnabel, editors, Numerical
Optimization 1984, pages 139-159, SIAM, Philadelphia, 1985.

[GMSW86] P. E. Gill, W. Murray, M. A. Saunders, and M. H. Wright. User's Guide for NPSOL
(Version 4.0): a Fortran package for nonlinear programming. Report SOL 86-2,
Department of Operations Research, Stanford University, 1986.

[GMSW87] P. E. Gill, W. Murray, M. A. Saunders, and M. H. Wright. A Schur-complement
method for sparse quadratic programming. Report SOL 87-12, Department of
Operations Research, Stanford University, 1987.

[GMSW88] P. E. Gill, W. Murray, M. A. Saunders, and M. H. Wright. A practical anti-cycling
procedure for linear and nonlinear programming. Report SOL 88-4, Department of
Operations Research, Stanford University, 1988.

[Gou85] N. I. M. Gould. On practical conditions for the existence and uniqueness of solutions
to the general equality quadratic programming problem. Mathematical Programming,
32, 90-99, 1985.

[Gou86] N. I. M. Gould. An algorithm for large-scale quadratic programming. Report CSS
219, AERE Harwell, United Kingdom, 1986.

[Hay68] E. V. Haynsworth. Determination of the inertia of a partitioned Hermitian matrix.
Linear Algebra and its Applications, 1, 73-81, 1968.

[Hoy86] S. C. Hoyle. A single-phase method for quadratic programming. Report SOL 86-9,
Department of Operations Research, Stanford University, 1986.

[Maj71] A. Majthay. Optimality conditions for quadratic programming. Mathematical
Programming, 1, 359-365, 1971.

[MK87] K. G. Murty and S. N. Kabadi. Some NP-complete problems in quadratic and
nonlinear programming. Mathematical Programming, 39, 117-129, 1987.



[Mur71] W. Murray. An algorithm for finding a local minimum of an indefinite quadratic
program. Report NAC 1, National Physical Laboratory, England, 1971.

[Osb85] M. R. Osborne. Finite Algorithms in Optimization and Data Analysis. John Wiley
and Sons, Chichester and New York, 1985.

[PS88] P. M. Pardalos and G. Schnitger. Checking local optimality in constrained quadratic
programming is NP-hard. Operations Research Letters, 7, 33-35, 1988.

[RO86] D. M. Ryan and M. R. Osborne. On the solution of highly degenerate linear programs.
Mathematical Programming, 41, 385-392, 1986.

[Wil65] J. H. Wilkinson. The Algebraic Eigenvalue Problem. The Clarendon Press, Oxford,
1965.


REPORT DOCUMENTATION PAGE

Security classification: UNCLASSIFIED
Report number: SOL 88-3
Title: Inertia-Controlling Methods for Quadratic Programming
Type of report: Technical Report
Authors: Philip E. Gill, Walter Murray, Michael A. Saunders and Margaret H. Wright
Contract or grant number: N00014-87-K-0142
Performing organization: Department of Operations Research - SOL, Stanford University,
Stanford, CA 94305-4022
Controlling office: Office of Naval Research - Dept. of the Navy, 800 N. Quincy Street,
Arlington, VA 22217
Report date: November 1988
Number of pages: 40 pages
Distribution statement: This document has been approved for public release and sale;
its distribution is unlimited.
Key words: quadratic programming, indefinite quadratic programs, active-set method,
inertia.
Abstract: (Please see reverse side)


Inertia-Controlling  Methods 
for  Quadratic  Programming 

by  Philip  E.  Gill,  Walter  Murray, 
Michael  A.  Saunders  and  Margaret  H.  Wright 

Technical  Report  SOL  88-3  Abstract 


Active-set  quadratic  programming  (QP)  methods  use  a  working  set  to  define  the  search  direction 
and  multiplier  estimates.  In  the  method  proposed  by  Fletcher  in  1971,  and  in  several  subsequent 
mathematically  equivalent  methods,  the  working  set  is  chosen  to  control  the  inertia  of  the  reduced 
Hessian, which is never permitted to have more than one nonpositive eigenvalue. (We call such methods
inertia-controlling.)  This  paper  presents  an  overview  of  a  generic  inertia-controlling  QP  method, 
including  the  equations  satisfied  by  the  search  direction  when  the  reduced  Hessian  is  positive  definite, 
singular  and  indefinite.  We  also  derive  recurrence  relations  that  facilitate  the  efficient  implementation  of 
a  class  of  inertia-controlling  methods  that  maintain  the  factorization  of  a  nonsingular  matrix  associated 
with  the  Karush-Kuhn-Tucker  conditions. 